proper use/function of 'set intersect'

rgonzale6
Path Finder

I have an index where events contain a source IP and a URL destination field. I would like to construct a query that would show commonality in events. I would like to search multiple IPs and have my search return only URLs that have been contacted by all of those IPs. I had constructed my search like so, for only two IPs:

set intersect [search index=INDEX_NAME Internal_IP=IPADDR1 | fields URL ] [search index=INDEX_NAME Internal_IP=IPADDR2 | fields URL]|fields URL

By my thinking, this would return only the URL fields where there was commonality found between the results of the two subsearches here. It's not working.

I can do a subsearch that does this easily enough when it's only two hosts...but in practice, I will need the results for far more than two hosts. Here's my subsearch-based solution for two hosts, which works well:

[search index=INDEX_NAME Internal_IP=IPADDR1 | fields URL] index=INDEX_NAME Internal_IP=IPADDR2 | fields URL | top URL

Thanks!

1 Solution

Stephen_Sorkin
Splunk Employee

Using set isn't going to be the most efficient way to solve this problem.
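For reference, here is a sketch of how the set version might be written so that it at least runs. It rests on two assumptions that are not confirmed anywhere in this thread: that the missing leading pipe is part of the problem (set is a generating command and has to start the search), and that each subsearch should be reduced to one row per URL so that only the URL value is compared. IPADDR1 and IPADDR2 are the placeholders from the question.

```spl
| set intersect
    [search index=INDEX_NAME Internal_IP=IPADDR1 | stats count by URL | fields URL]
    [search index=INDEX_NAME Internal_IP=IPADDR2 | stats count by URL | fields URL]
```

Note that set takes exactly two subsearches, so this form gets awkward once more than two IPs are involved.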

I'd use stats to look at the source IP characteristics for each url like:

index=INDEX_NAME | stats values(Internal_IP) as Internal_IPs dc(Internal_IP) as Internal_IP_count by URL

You can then pipe the results of this to | search Internal_IP_count > <threshold> to see the URLs that were accessed by more than <threshold> IPs as well as the IPs that accessed them.
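To try the stats approach without touching a real index, here is a small self-contained sketch. The sample IPs and URLs are made up for illustration, and the threshold of 2 is an assumption; substitute your own base search and value. The first five lines only build three fake events with Internal_IP and URL fields, and the last two lines are the stats and filter steps described above.

```spl
| makeresults
| eval raw="10.0.0.1,a.example.com;10.0.0.2,a.example.com;10.0.0.1,b.example.com"
| makemv delim=";" raw
| mvexpand raw
| eval Internal_IP=mvindex(split(raw, ","), 0), URL=mvindex(split(raw, ","), 1)
| stats values(Internal_IP) as Internal_IPs dc(Internal_IP) as Internal_IP_count by URL
| search Internal_IP_count>=2
```

With this sample data, only a.example.com should remain, since it is the only URL contacted by both IPs.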

ziegfried
Influencer

Just an idea for a different approach:

index=INDEX_NAME (Internal_IP=IPADDR1 OR Internal_IP=IPADDR2) | stats dc(Internal_IP) as ip_count by URL | where ip_count>1

which would reduce the events to those with the IPs you're interested in before computing the number of distinct IP addresses per URL, and then filter the results to only those URLs that have been accessed by all of the IPs. It can easily be expanded to more IPs:

index=INDEX_NAME (Internal_IP=IPADDR1 OR Internal_IP=IPADDR2 OR Internal_IP=IPADDR3) | stats dc(Internal_IP) as ip_count by URL | where ip_count>2
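The same pattern for a longer list can be written more compactly. This is a sketch, assuming a Splunk version that supports the IN operator in the search command, with IPADDR1 through IPADDR4 as placeholders and matched_ips as an illustrative field name. The number in the where clause has to match the number of IPs listed.

```spl
index=INDEX_NAME Internal_IP IN (IPADDR1, IPADDR2, IPADDR3, IPADDR4)
| stats values(Internal_IP) as matched_ips dc(Internal_IP) as ip_count by URL
| where ip_count=4
```

Because the base search already restricts events to the listed IPs, ip_count can never exceed the length of the list, so ip_count=4 and ip_count>3 select the same rows.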

rgonzale6
Path Finder

thanks! Appreciate your response.

rgonzale6
Path Finder

thanks! Much appreciated.
