Splunk Search

How do I optimize my search to return results faster?

Motivator

So here's my workflow.

  1. I have a request from an outside source that wants me to scrub my data for certain IP addresses.
  2. I get this data and scrub it into a .csv for a lookup.
  3. I put this lookup in Splunk and run a search against it.

My problem is that I'm searching dense data over 180 days, and it takes a few hours to run a search, AND the values in the .csv change weekly.

Currently I'm attempting to use a data model, though data models don't allow | where filters.

The search I have is pretty easy:

index=network sourcetype=fgt_traffic 
| lookup match_address match_address AS dest_ip OUTPUT match_address 
| where isnotnull(match_address) 
| stats count by match_address

All I'm looking for is the number of hits and the address that is matching, but I need to be able to get results much faster than a few hours.

I mentioned I'm trying to use a data model, but I was wondering if tstats/tscollect would be a better way to summarize this information, or maybe a KV Store somehow?
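
For the tstats idea: if the traffic data were accelerated in a data model, the weekly comparison could be sketched as a single tstats call over the acceleration summaries. The data model and field names below (the CIM Network_Traffic model) are assumptions for illustration, not something from this thread:

| tstats summariesonly=true count from datamodel=Network_Traffic 
    where [| inputlookup match_address | rename match_address as Network_Traffic.dest_ip] 
    by Network_Traffic.dest_ip

This scans the pre-built summaries instead of the raw events, which is typically where the multi-hour runtime goes.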

I'm thinking I need a summary index of just the dest_ip values per event within my index, so that I can quickly compare against a lookup file that will change weekly. Any thoughts would be helpful.

Thank you!

Communicator

Did you try the TERM() keyword? The magic behind it is described, for example, in this conference session: Indexed Tokens and You.

I was able to greatly improve searches for single IP addresses with TERM(). It works even better when the logs you are searching do not have quotes (") around the value of dest_ip and contain field=value pairs in the _raw data, because you can then do something like TERM(dest_ip=192.168.1.1).

You can even use the inputlookup that @masonmorales suggested, once you know how to format the output of the subsearch:

index=network sourcetype=fgt_traffic [|inputlookup match_address
| eval search="TERM(".match_address.") OR "
| stats values(search) as search 
| nomv search
| eval search="(".rtrim(search, " OR ").")"]
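
For illustration, with two hypothetical lookup rows 10.0.0.1 and 10.0.0.2, the subsearch above returns a string that makes the outer search effectively:

index=network sourcetype=fgt_traffic (TERM(10.0.0.1) OR TERM(10.0.0.2))

so Splunk can match the addresses against the indexed tokens instead of extracting and comparing fields on every event.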

Edit: Insert " OR " between terms.

Influencer

Maybe something like this?

 index=network sourcetype=fgt_traffic  [|inputlookup match_address | rename match_address as dest_ip]
| stats count by dest_ip

Using an inputlookup as a subsearch ORs the rows of the lookup together and ANDs that whole expression onto the base search. The key is to match the field names.
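
Concretely, if the lookup held the hypothetical rows 10.0.0.1 and 10.0.0.2, the outer search effectively becomes:

index=network sourcetype=fgt_traffic ((dest_ip="10.0.0.1") OR (dest_ip="10.0.0.2"))
| stats count by dest_ip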

Motivator

This will give me the results, I agree, but it's not just the results I'm looking for; it's the three-hour runtime over 180 days that I'm trying to reduce.

SplunkTrust

Is match_address just one IP?

Motivator

match_address will be a series of IP addresses in the lookup file.

The lookup file looks like this:

match_address
10.10.10.1
10.43.56.22
157.46.8.2

etc.

SplunkTrust

Since data model searches don't allow filters and your lookup changes weekly, I would set up a summary index to calculate the count more frequently (hourly/daily) and then use the summary index for your report.
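
A sketch of that setup, assuming a daily schedule and a summary index named ip_summary (both the index name and the interval are placeholders): a scheduled search collects per-address counts for the previous day,

index=network sourcetype=fgt_traffic earliest=-1d@d latest=@d
| stats count by dest_ip
| collect index=ip_summary

and the weekly report then runs against the much smaller summary:

index=ip_summary [| inputlookup match_address | rename match_address as dest_ip]
| stats sum(count) as count by dest_ip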
