I have a list of 200+ IPs that I need to search against the source addresses in our firewall data. The search needs to span several months of these logs, and we consistently ingest under 200 GB/day of this log type.
I have uploaded the list of 200+ IPs as a lookup table and can get the following search to run for short time-frames:
@kearaspoor, the only thing I can think of is more indexers and faster disks, but there are a few things you can do to improve search performance.
Always use fast mode.
Be specific and don't use extracted fields. For example, ip=255.255.255.1 uses an extracted field; instead use "255.255.255.1". Full example: index=foo sourcetype=bar AND (255.255.255.1 OR 10.98.87.1 OR 126.96.36.199). Try it; you will see a significant increase in performance over using key=value.
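A quick side-by-side of the two forms, using the placeholder index/sourcetype from the example above:

```
slower (extracted field):  index=foo sourcetype=bar ip=255.255.255.1
faster (literal string):   index=foo sourcetype=bar "255.255.255.1"
```

Roughly speaking, the quoted literal can be matched against indexed terms up front, while the key=value form relies on search-time field extraction.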
Use NOT to exclude events you don't need, such as successes. Example: index=foo sourcetype=bar AND (255.255.255.1 OR 10.98.87.1 OR 188.8.131.52) **NOT "success"**.
If those strings occur in other fields, add a where statement: ...| where isnotnull(ip)
Lastly, you can use the return command to run a subsearch for certain fields and return the values into the main search. Example: index=firewall sourcetype="firewall" [| inputlookup SampleIPs.csv | return 300 $SampleIPs ] | where isnotnull(source_address)
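Laid out as a sketch (assuming the lookup file is SampleIPs.csv with a column named SampleIPs, as in the example above), note the leading pipe before inputlookup, since it is a generating command:

```
index=firewall sourcetype="firewall"
    [| inputlookup SampleIPs.csv | return 300 $SampleIPs ]
| where isnotnull(source_address)
```

The $ prefix on $SampleIPs tells return to emit only the values rather than field=value pairs, and 300 raises return's row limit above the ~200 IPs in the lookup.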
What the return command does is return a string: (10.1.3.5) OR (10.67.89.145) OR (89.76.222)
And when combined with your base search it looks something like this: index=firewall sourcetype="firewall" (10.1.3.5) OR (10.67.89.145) OR (89.76.222)
I really thought you were on to something by removing the extracted fields. Unfortunately, the lookup returns field/value pairs rather than just the IPs, so the subsearch expands to SampleIP= OR SampleIP= instead of IP OR IP 😞
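One workaround worth sketching here (untested, and assuming the lookup column is named SampleIPs): rename the lookup field to search inside the subsearch, since a subsearch field named search is expanded as raw search terms rather than field=value pairs:

```
index=firewall sourcetype="firewall"
    [| inputlookup SampleIPs.csv | rename SampleIPs AS search | fields search ]
| where isnotnull(source_address)
```

With that rename in place, the subsearch should expand to IP OR IP instead of SampleIPs=IP OR SampleIPs=IP.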