Splunk Search

How to efficiently query all indexes for a list of IPs

asearson
Explorer

BACKGROUND: My Disaster Recovery team is compiling a list of all IP endpoints and has asked me to query all of my Splunk events (in all indexes) for anything resembling an IP. I created the following search, which works on my smaller staging Splunk Enterprise instance but times out when I run it on my larger production instance:

index="*" earliest=-1d@d latest=-0d@d
| rex field=_raw "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| stats values(ip)

As a workaround to avoid the timeout, I've split the production search into multiple searches, one per index.

QUESTIONS:

  1. Is there a more efficient way to get the IPs my DR wants?
  2. Is there an efficient way to join the results of the multiple index searches in Prod?

bowesmana
SplunkTrust

I'm assuming the regex is fine, as you seem happy with that, so in terms of efficiency, if this is a one-off operation, does efficiency matter?

Your query is searching yesterday. Is the intention that it searches further back than that? Could you just run a backfill operation and let Splunk handle the scheduling?

If you're looking for a general solution, then you could output each production index search to a CSV (outputlookup append=t) and then after running all the searches, just inputlookup the csv and stats count on the data.
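
A rough sketch of that approach, assuming a hypothetical index named web and a hypothetical lookup file all_ips.csv (neither name comes from the original post). Run the first search once per index, changing only the index name:

```
index=web earliest=-1d@d latest=@d
| rex field=_raw "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| stats count by ip
| outputlookup append=true all_ips.csv
```

Once every index has been processed, read the file back and collapse the IPs that appeared in more than one index:

```
| inputlookup all_ips.csv
| stats sum(count) AS count by ip
```

Aggregating with stats before writing keeps the CSV small, since each per-index search appends one row per distinct IP rather than one row per event.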


gcusello
SplunkTrust

Hi asearson,
I can't check your regex because you didn't share an example, so I'll assume it's correct.
Anyway, to list all the IPs you should use the dedup and table commands:

index="*" earliest=-1d@d latest=-0d@d
| rex "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| dedup ip
| sort ip
| table ip

I have only one doubt: you want all the IPs from all indexes, but different sourcetypes usually have different log formats, so how do you plan to extract IPs from every sourcetype with a single regex?

Maybe you could use a different approach:
for servers, you could resolve hostnames to IPs through DNS with the dnslookup lookup, in this way:

index=_internal
| dedup host
| lookup dnslookup clienthost AS host OUTPUT clientip
| sort host
| table host clientip

For appliances that use standard syslog, you can extract IPs with an appropriate regex because the IP is always in the same position.
Appliances that don't use standard syslog usually have the IP in the hostname.

Ciao.
Giuseppe

asearson
Explorer

Thanks for the reply, but not exactly the answer I'm looking for...

CLARIFICATION OF MY PROBLEM STATEMENT:
I need to capture every IP found in all logs, regardless of index/host/source/sourcetype. A single weblog from a busy webserver could yield thousands of IPs, one for each unique client requesting a popular webpage. I'm not concerned about hostnames.

CLARIFICATIONS TO YOUR QUESTIONS:
Example is anything between 0.0.0.0 and 255.255.255.255.
Regex taken from www.regular-expressions.info/ip.html and verified with regex101.com

The idea for "rex field=_raw" is taken from this:
https://answers.splunk.com/answers/656616/how-to-extract-ip-address-using-regex.html
It is applying to every RAW event, regardless of sourcetype or log format.
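
A caveat on the rex approach: as far as I know, rex stops at the first match in each event unless max_match is set, so an event containing several addresses would contribute only one of them. A minimal sketch that asks for every match (max_match=0 means unlimited) and lets stats flatten the resulting multivalue field. Whether this is affordable on a production-sized dataset is untested here:

```
index="*" earliest=-1d@d latest=@d
| rex field=_raw max_match=0 "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| stats values(ip) AS ip
```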

TESTING:
I tested your pipeline "| dedup ip | sort ip | table ip", and the Job Inspector shows that it actually takes longer than the single "| stats values(ip)" pipe. They yield the same results, with a slightly different sort order (string rather than integer).


bowesmana
SplunkTrust

Sorting is a bad idea: 'sort' without '0' will truncate the results at the sort limit (default 10000).
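
If a sorted list is still wanted, a sketch of the dedup pipeline with the limit lifted. The 0 removes the row cap, and the ip() modifier asks sort to compare the values as IP addresses rather than as strings:

```
index="*" earliest=-1d@d latest=@d
| rex field=_raw "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| dedup ip
| sort 0 ip(ip)
| table ip
```

This does nothing for the runtime, so it only makes sense once the result set has already been reduced.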
