Splunk Search

Get source file names containing a specific value without searching through every event within

Path Finder

I need to get the names of the source files that contain a specific value. The search takes a long time because each file contains millions of lines. The value I am searching for is repeated on every line. Is there a way to read just one line per source to speed up the search?

Thanks,


This is my search:

eventtype=perf | stats first(customer) as cust by source | search cust=$customerToken$ | sort -_time | rex field=source "(?<sourcename>([^/]*).[^.]*$)" | fields source sourcename

The customer token is selected from a dropdown.

Here is some data:

<date>;<requestType>;<requestTime>;<customer>;<version>

Values:

2014-06-11;com.ws.rich.content.adapter.RichContentAdapterPortType.getMultipleTextContent; 3; pv-1;1.2.5

2014-06-11;com.ws.distribution.reservation.ReservPortType.quote;186;pv-1;1.2.5

2014-06-11;com.ws.rich.content.adapter.RichContentAdapterPortType.getMultipleTextContent;3;pv-1;1.2.5

2014-06-11;com.ws.sales.book.BookingPortType.createBooking;773;pv-1;1.2.5

This goes on for millions of lines in each source file.

Each source file contains a single customer throughout.

It is really a matter of finding all source files that contain a specific customer without searching every file line by line.


Re: get source file names containing a specific value without search through every event within

SplunkTrust

Do post your search and some sample data - then we might see a way to speed things up.


Re: get source file names containing a specific value without search through every event within

Path Finder

Ok. Sample data and search posted.


Re: get source file names containing a specific value without search through every event within

SplunkTrust

If each source file contains only one customer, you can avoid loading all events before the stats by filtering in the base search, like this:

eventtype=perf cust="$customerToken$" | stats count by source | rex field=source "(?<sourcename>([^/]*).[^.]*$)" | fields source sourcename

That way you'll only load sources that contain the customer you're looking for.

If that isn't fast enough, you could define a summary index or a lookup that captures one event for every source as soon as it's added to Splunk, and have your search run off that. See http://blogs.splunk.com/2011/01/11/maintaining-state-of-the-union/ for an example.
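
A minimal sketch of the lookup variant. The lookup file name source_customer.csv is hypothetical, and it assumes a scheduled search keeps that file up to date; neither comes from the linked post. The scheduled search records one customer per source:

```spl
eventtype=perf
| stats first(customer) as cust by source
| outputlookup source_customer.csv
```

The dashboard search then reads only the lookup rather than the raw events:

```spl
| inputlookup source_customer.csv
| search cust="$customerToken$"
| fields source
```

The populating search is still the slow one, but it runs in the background on a schedule, so the dropdown only has to wait for a lookup with one row per source.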

Edit:

If your number of source files is low then you can do this:

| metadata type=sources index=yourindex | map maxsearches=yournumberofsources search="eventtype=perf cust=\"$customerToken$\" source=$$source$$ | head 1" | rex field=source "(?<sourcename>([^/]*).[^.]*$)" | fields source sourcename

However, I remember there being some bug around the different layers of dollar tokens - one is for the form value, one is for the map value. The dashboard may get confused there. If you can get it to work then this should be blazingly fast because of the head.
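
To illustrate the two token layers, here is a sketch under three assumptions: the dashboard is Simple XML (where $$ is the escape for a literal $), the dropdown value is pv-1, and there are ten sources. The dashboard should substitute $customerToken$ and reduce $$source$$ to $source$ before the search runs, which leaves map to fill in $source$ from each row returned by metadata. The search as Splunk receives it would then look roughly like this:

```spl
| metadata type=sources index=yourindex
| map maxsearches=10 search="eventtype=perf cust=\"pv-1\" source=$source$ | head 1"
```

Pasting that form straight into the search bar, outside the dashboard, is one way to check whether a problem lies in the token handling or in the search itself.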


Re: get source file names containing a specific value without search through every event within

Path Finder

Preliminary tests indicate that this does not speed up the search. It still takes upwards of thirty seconds to find 3 source files out of a total of 10. I will look into the summary index as soon as I have a bit of spare time.
Thanks anyway


Re: get source file names containing a specific value without search through every event within

SplunkTrust

If your number of source files is low then you can do this:

| metadata type=sources index=yourindex | map maxsearches=yournumberofsources search="eventtype=perf cust=\"$customerToken$\" source=$$source$$ | head 1" | rex field=source "(?<sourcename>([^/]*).[^.]*$)" | fields source sourcename

However, I remember there being some bug around the different layers of dollar tokens - one is for the form value, one is for the map value. The dashboard may get confused there. If you can get it to work then this should be blazingly fast because of the head.


Re: get source file names containing a specific value without search through every event within

Path Finder

The last comment worked. You can convert it to an answer and I will accept it.


Re: get source file names containing a specific value without search through every event within

SplunkTrust

Great. I've added that to the answer.
