Solved: Re: Searching intersection of two subsets

nik_splunk · ‎05-01-2010

Hello Splunkers,

Thanks for visiting my question.

I have two subsets of data related to each other.

Set A consists of 50 items. Of these, only the top 5 interest me (subset AA'), once a condition has been evaluated.
Set B contains 25 items. Of these, only the top 5 interest me (subset BB'), once a condition has been evaluated.

My goal is:

Search for the top 5 elements of A (subset AA'), once a condition has been evaluated.
Search for the top 5 elements of B (subset BB'), once a required condition has been evaluated.
Count how many elements of AA' intersect with each of the top 5 elements of BB'.
Note that (subsearch AA') AND (subsearch BB') gives different results than (subsearch BB') AND (subsearch AA').

Could someone kindly suggest how to build that search using only subsearches?

Thanks in advance.

Nik (currently working on Splunk 4.0.9)

gkanapathy · ‎05-02-2010

There is a way to do this in Splunk using the set command ( http://www.splunk.com/base/Documentation/latest/SearchReference/Set ), but there is likely a far more efficient way to do it without the set command or a bunch of subsearches: filter the data all at once, and then do ... | top 5 by subsetname | stats count by commonfield. Whether this is possible depends on the specific conditions and the data.
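
As a rough illustration of the set-command route, here is a minimal sketch. The sourcetypes set_a and set_b, the status=ok condition, and the field name item are all hypothetical placeholders, since the thread does not describe the actual data; substitute your own conditions and common field.

```
| set intersect
    [search sourcetype=set_a status=ok | top limit=5 item | fields item]
    [search sourcetype=set_b status=ok | top limit=5 item | fields item]
| stats count AS common_items
```

Each subsearch keeps only the shared field because the set command compares whole results, so any extra fields (such as the count and percent columns that top adds) would likely prevent matches. Dropping the final stats should list the shared items instead of counting them.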

In general, it may be a good idea to give examples or describe what your incoming data is and what you want out of it, rather than specifying the exact algorithm, because a lot of set-based/table-based algorithms may be replaced with better algorithms that take better advantage of how Splunk processes data.
