Splunk Search

Grouped Search Results by IP

sdwilkerson
Contributor

I am building a small visual app to assist cyber-security analysts.

They have an automated process to identify "SOIs" (Systems of Interest).
I created a lookup that includes these IPs and EPOCH_TIME. They want to automate the initial reconnaissance process to save them the time of manually collecting this info for their investigations.

For each IP on the list, they would like about 10 searches completed.
They have a simple form now which does this. You enter the IP and it retrieves the data from all 10 searches. However, this is ad-hoc and they want this new app to run overnight to collect this report data for each IP considered an SOI, so it is ready for the analysts when they arrive in the AM.

They want the results of all of that to remain in Splunk for at least 3 days so they can be queried with loadjob via SID, most likely displayed as part of this app.
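(i.e., pulling a finished job back with something like the line below, where the SID stands in for whatever the overnight run produced:)

    | loadjob 1233886270.2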

So, how do I marry up the 10 SIDs that were used to query a given IP address, and tell which of the searches ran against which IP? Since I am using a subsearch to inputlookup the soi_list lookup table, the only unique identifier we have (the IP address) is not in the search expression, so I don't see it when I search the audit logs. Searching the audit index, I can't tell which search was for one IP and which was for another.
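To illustrate the shape of the problem, each of the searches is driven along these lines (index and field names here are placeholders, but the pattern is the same), so the literal IP never appears in the search string that lands in the audit index:

    index=netflow [ | inputlookup soi_list | fields ip | rename ip AS src_ip ]
    | stats count by src_ip, dest_ip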

Does anyone have any good ideas on how to find all of the SIDs that queried a specific IP?
Is it possible for a search to be aware of its own SID? If so, then I could append the SID to my lookup table and wouldn't have to search the audit index at all.
Is there a smarter way to do batch processing of savedsearches against a group of IPs than what I described here?

Thanks,

Sean


ganesh11
New Member

You can find the geographic lookup of the IP address using the site http://www.ip-details.com/. It gives the IP address and its location details, such as country, region/state, city, and ISP (Internet Service Provider).


gkanapathy
Splunk Employee

I have a few suggestions:

  • Don't just run the search and then use loadjob to get the results. Instead, use outputcsv and inputcsv. outputcsv lets you specify a specific file name, and you can use a subsearch to fill in that argument and build a per-IP file name, though you will have to be a little tricky with it (... | outputcsv [ inputlookup ip | head 1 | eval addr="srch1_".addr."_at_".now().".csv" | return $addr ]). You'll have to figure out the exact quoting yourself, but the point is a predictable, unique file name per search and IP; see the cleaned-up sketch after this list.
  • Just run the searches from an external script. If you're kicking off 10 searches for some number of IP addresses, I can't imagine it needs to be kicked off from the Splunk UI; but even if it is, simply have that call a custom search command that runs the scripts as a side effect. Or just have a totally different page/interface for launching scripts.
  • If you use an external script, you can launch searches using the REST API, and that will return the SID of the job.
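To expand on the outputcsv idea above, a cleaned-up sketch (index, field, and lookup names are invented; outputcsv writes its file under $SPLUNK_HOME/var/run/splunk) might look like:

    index=netflow src_ip="10.1.2.3"
    | stats count by dest_ip, dest_port
    | outputcsv [ | inputlookup soi_list | search ip="10.1.2.3" | head 1 | eval f="srch1_".ip."_".now() | return $f ]

The app would then read the file back with something like | inputcsv srch1_<ip>_<timestamp>, so no SID bookkeeping is needed at all.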

I guess the thing is, I'm not sure that it makes sense to try to do this use case entirely in the search language.

BTW, you can add the SID to search results with the addinfo search command. This adds it as a field to every result though.
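For example, a rough sketch: each scheduled search can stamp its own SID next to the IP it covered, giving you the SID-to-IP map directly (field and lookup names here are invented, and append=true on outputlookup needs a reasonably recent version):

    index=netflow src_ip="10.1.2.3"
    | addinfo
    | head 1
    | eval ip="10.1.2.3", sid=info_sid
    | table ip, sid
    | outputlookup append=true soi_sid_map.csv

One caveat: ending the scheduled search this way makes the ip/sid table the job's result. If you also need the job's full results for loadjob, write the map from a separate lightweight search, or tack the outputlookup on inside an appendpipe so the main results pass through untouched. Your app can then inputlookup soi_sid_map.csv, find the SIDs for a given IP, and hand each one to loadjob.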

sdwilkerson
Contributor

Thanks Gerald. Since 4.0 came out I had believed inputcsv/outputcsv were on the way out, so I have tried to use inputlookup and outputlookup instead. Do you expect them to stay around? That would be nice.
Why is loadjob not preferred? Is it efficiency, or the complication of having to ensure the results are still available, etc.?
An external script occurred to me, but I was trying to keep it all managed in the product and not rely on the filesystem. Still a consideration, thank you.
Right now I am getting the SID via rest calls in search. It is an extra step though.
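(Something along these lines, where the label filter assumes a naming convention for the saved searches that I made up:)

    | rest /services/search/jobs
    | search label="SOI_*"
    | table sid, label, dispatchState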
Addinfo... (slaps forehead, right).


sideview
SplunkTrust

How complex are the searches, and how much do the fields they operate on overlap? You could theoretically run one scheduled "super search" that collects the superset of field data across all the hosts. Then load this into a custom list view that displays the list of hosts, and have the drilldown clicks go to an soi_detail view. That view would load the most recent saved job for the super search, select the drilled-down-on host from a Pulldown module, and slice the correct stats out for each of the 10 final searches and the selected host, all with postProcess.
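For instance, with invented field names, the scheduled super search might look like:

    index=netflow [ | inputlookup soi_list | fields ip | rename ip AS src_ip ]
    | stats count AS events, dc(dest_port) AS ports, values(dest_ip) AS dests by src_ip, sourcetype

and the soi_detail view would slice out one host with a postProcess such as:

    search src_ip="10.1.2.3" | stats sum(events) AS events, max(ports) AS ports by sourcetype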

If the searches overlap somewhat and the super search doesn't have completely degenerate performance, this wouldn't be wildly difficult with Sideview Utils.

On the other hand, if the searches each do structurally different things, like qualitatively different transaction commands or complex work with multivalue fields, then the postProcess approach quickly gets too crazy.


sdwilkerson
Contributor

Nick, thanks. Interesting idea. The searches are mostly easy/basic. Unfortunately, new SOI items might be added throughout the day, so we need to do periodic one-offs; the megasearch might not be a good fit then.
