Splunk Search

Searching intersection of two subsets

Path Finder

Hello Splunkers,

Thanks for visiting my question.

I have two subsets of data related to each other.

  1. Set A consists of 50 items. Of these, only the top 5 items interest me (subset AA'), once a condition has been applied.
  2. Set B contains 25 items. Of these, only the top 5 items interest me (subset BB'), once a condition has been applied.

My goal is:

  1. Search for the top 5 elements of A (subset AA'), once a condition has been applied.
  2. Search for the top 5 elements of B (subset BB'), once a required condition has been applied.
  3. Count how many intersections the elements of AA' have with each of the top 5 elements of BB'.
  4. Note that (subsearch AA') AND (subsearch BB') gives different results from (subsearch BB') AND (subsearch AA').

Can someone kindly suggest how to build that search using only subsearches?

Thanks in advance.

Nik (currently working on Splunk 4.0.9)


Splunk Employee

There is a way to do this in Splunk using the set command ( http://www.splunk.com/base/Documentation/latest/SearchReference/Set ), but there is likely a far more efficient way to do it without the set command or a bunch of subsearches: filter the data all at once, and then do something like ... | top 5 by subsetname | stats count by commonfield. Whether this is possible depends on the specific conditions and the data.
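For reference, a rough sketch of the set-based route might look like the search below. Everything specific in it is an assumption for illustration, not something taken from the question: the sourcetype values set_a and set_b, the filters condition_a=true and condition_b=true, and the field name item are all placeholders. Each subsearch keeps only the item field, because set compares whole result rows and the count and percent columns produced by top would otherwise prevent matches.

```
| set intersect
    [search sourcetype=set_a condition_a=true | top 5 item | fields item]
    [search sourcetype=set_b condition_b=true | top 5 item | fields item]
| stats count
```

If it behaves as intended, this returns the number of items that appear in both top-5 lists; dropping the final stats count should list the shared items themselves. Note that it gives one overall count rather than a breakdown per element of BB'.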

In general, it is a good idea to give examples or describe what your incoming data is and what you want out, rather than prescribing a specific algorithm, because a lot of set-based/table-based algorithms can be replaced with better ones that take advantage of how Splunk processes data.
