Splunk Search

How to loop through all the values of a list, run the same search for each value, and finally combine the results?

xiangtaner
Path Finder

Hi,

Here is an example. I have a list of IP addresses, and for each IP address I need to find all the hosts assigned to it during the past 7 days. The process for finding the assigned hosts is the same for every IP address in the list. How can I loop through the list and combine the individual results at the end?

Thanks and Regards,

Wayne

1 Solution

lguinn2
Legend

Try not to think in terms of loops when using Splunk - it is a hard habit to break, but it is not the paradigm that Splunk uses. Think instead of gathering all the data you need at once, and then reducing it to the statistics that you want.

It is actually even easier if you report on all IP addresses in the logs, and not just the IP addresses in a list. For example:

sourcetype=ip_assignments OR your_search_here
| stats values(host) as Host_List dc(host) as Host_Count by ip

If you really need only a list of IP addresses, I suggest that you use lookups. Put your list of IP addresses in a CSV file (for this example, let's call it ipList.csv). Load the CSV file into Splunk as a lookup table, then do this

sourcetype=ip_assignments [ | inputlookup ipList.csv | fields ip ]
| stats values(host) as Host_List dc(host) as Host_Count by ip
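
It may help to see what the subsearch turns into before the outer search runs. As a sketch, assuming ipList.csv holds the three sample addresses shown further down in this answer, running the subsearch on its own and ending it with the format command shows the filter that Splunk builds from it:

```
| inputlookup ipList.csv
| fields ip
| format
```

With those three rows, the resulting "search" field should read roughly ( ( ip="192.168.1.1" ) OR ( ip="10.1.1.3" ) OR ( ip="54.69.58.243" ) ). In other words, the outer search makes a single pass over the data with an OR filter, instead of running one search per address.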

If there are more than 100 IP addresses in the CSV file, you will probably need to approach it a little differently. You will need to create an actual lookup definition in addition to loading the CSV file. Make sure that the lookup has a default value set. (For this example, I will call the lookup ip_lookup, and assume that the default value is set to "unknown".)

sourcetype=ip_assignments
| lookup ip_lookup ip OUTPUT ip_result
| where ip_result != "unknown"
| stats values(host) as Host_List dc(host) as Host_Count by ip

There is a good tutorial on setting up lookups here. If you want to set up lookups by directly editing configuration files, read this. For my examples, I assumed that the CSV file would look like this

ip,ip_result
192.168.1.1,int
10.1.1.3,int
54.69.58.243,ext

It doesn't really matter what is in the "ip_result" column for my examples. You could have additional columns as well.
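
For reference, here is a minimal sketch of what the lookup definition could look like if you edit the configuration files directly. The stanza name ip_lookup and the file name ipList.csv follow the examples above; everything else is an assumption to check against your own environment. Note that default_match only takes effect when min_matches is at least 1.

```
# transforms.conf (sketch; adjust names and app location to your setup)
[ip_lookup]
filename = ipList.csv
# require at least one match so that default_match is applied
min_matches = 1
# value written to ip_result when the ip is not found in the CSV
default_match = unknown
```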

xiangtaner
Path Finder

Thanks Lguinn, this is helpful!

Now, there are actually two additional challenges for my situation:

  1. My main search is a multisearch, i.e. | multisearch [search source1] [search source2] ... [search source5]. I believe it will be faster and more efficient if I put the list of IPs in each subsearch; besides, there is a 50000-row limit for subsearches if I do not filter the sources first. So how can I use the lookup table in each subsearch?

  2. The list of IPs is generated on the fly from a parameter that is passed in. How can I combine that list with the main search using the lookup table? Do I need to split the process into two pieces of code, i.e. outputlookup iplist.csv in the first piece and inputlookup in the second? Or can I combine them in one block of code?

Thanks and Regards,

Wayne

lguinn2
Legend

Why can't you search this way? Why do you need multisearch?

(source=source1 crit="value1") OR (source=source2) OR (source=source3 crit2="valueX") [ | inputlookup ipList.csv | fields ip ]
| ...

xiangtaner
Path Finder

Thanks!

But what if the IPs from source3 need to be extracted with "rex"? Can we do "rex" inside (source=source3 ...), i.e. (source=source3 | rex "...(?<ip>\S...*)")?

Appreciate your help!

lguinn2
Legend

You can't do rex in the search, but you can do it in the subsequent statements. Remember that rex will only extract the field in events that match the regular expression - events that don't match will be unchanged, and that means that any existing field definitions (such as for the ip field) will be preserved.

So you could do

(source=source1 crit="value1") OR (source=source2) OR (source=source3 crit2="valueX") [ | inputlookup ipList.csv | fields ip ]
| rex "pattern part 1 (?<ip>\d+\.\d+\.\d+\.\d+) pattern part 2"
| ...

xiangtaner
Path Finder

The IP in source3 is actually extracted by "rex", so the "rex" part has to come before the inputlookup part. MuS suggested the following approach, and it works in combination with subsearches.

search sourcetype=source1 | rex "...(?<ip>\S*)..." | search [ | inputlookup host2ips.csv | fields ip ]

Thanks!

lguinn2
Legend

Ah I see - that is a better way to address the problem
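
Putting the pieces of this thread together, a sketch of the whole search might look like the following. It reuses the source names, the placeholder rex pattern, and the ipList.csv file from the earlier posts, so all of those are assumptions to adjust to your data. The rex runs first so that events from source3 get an ip field, and only then are the events filtered against the list:

```
(source=source1 crit="value1") OR (source=source2) OR (source=source3 crit2="valueX")
| rex "pattern part 1 (?<ip>\d+\.\d+\.\d+\.\d+) pattern part 2"
| search [ | inputlookup ipList.csv | fields ip ]
| stats values(host) as Host_List dc(host) as Host_Count by ip
```

The trade-off is that the IP filter is applied after the events have been retrieved, so it is worth keeping the base search as selective as possible.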
