Splunk Search

comparing two searches and calculate the differences

smanojkumar
Communicator

Hi There!

   I would like to find the values of host that were in macro 1 but not in macro 2
search 1

 

`macro 1`
| fields host

 



search 2

 

`macro 2`
| fields host

 



macro 1

host
a
b
c
d

macro 2

host
a
b
e
f

Result

Count - 2
because host c and d were not in macro 2


Thanks in Advance!

Labels (1)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

Do you mean to only count difference and not showing host values?  In that case, add count.

 

`macro 1`
| eval source = "macro1"
| append
    [search `macro 2`
    | eval source = "macro2"]
| stats values(source) as source by host
| where mvcount(source) < 2 AND source == "macro 1"
| stats dc(host) as count_diff

 

If you don't get desired results, you need to examine data and post relevant data from each macro (anonymize as needed) as well as actual results.

Here is an emulation based on your original mock data.

macro 1macro 2
| makeresults format=csv data="host
a
b
c
d"
| makeresults format=csv data="host
a
b
e
f"

Adding them together (because search command is not used, the subsearch looks different)

 

| makeresults format=csv data="host
a
b
c
d"
| eval source="macro 1"
| append
    [makeresults format=csv data="host
a
b
e
f"
    | eval source="macro 2"]
| stats values(source) as source by host
| where mvcount(source) < 2 AND source == "macro 1"

 

This gives

hostsource
cmacro 1
dmacro 1

If I add | stats dc(host), it gives me 2.

View solution in original post

yuanliu
SplunkTrust
SplunkTrust

First, let me clarify that this problem is solvable as stated.  But you may want to reconsider how "macro 1" and "macro 2" is structured to make this easier.  You may also want to structure a different search to make this function more efficient.

Back to the stated problem.  The idea is to "tag" output from each macro, then count which host is in which output.

`macro 1`
| eval source = "macro1"
| append
    [search `macro 2`
    | eval source = "macro2"]
| stats values(source) as source by host
| where mvcount(source) < 2 AND source == "macro 1"

Note I insert "search" command in the subsearch because I do not know how "macro 2" is constructed.  It may not need that, or the "search" command may ruin it.  The where command also uses a feature/side effect of SPL's equality comparator against multivalue fields.

Tags (2)
0 Karma

smanojkumar
Communicator

Hi @yuanliu ,

   The code in macro is just the index name and source type and with several fields, as our need is only host, so we need that,

   same code in macro 2 as well, only the source type varies

macro 1
index=sap sourcetype=1A*

maceo 2
index=sap sourcrtype=2A*

Thanks in Advance!

0 Karma

yuanliu
SplunkTrust
SplunkTrust

In that case, the search I gave should work as is.  Have you tried?

0 Karma

smanojkumar
Communicator

Hi @yuanliu ,

   Few results were wrong, I don't know why!
  When I'm checking with in search, It is having macro1 as value but when I'm checking with that macro, there we are having that host.

Thanks in Advance!

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Do you mean to only count difference and not showing host values?  In that case, add count.

 

`macro 1`
| eval source = "macro1"
| append
    [search `macro 2`
    | eval source = "macro2"]
| stats values(source) as source by host
| where mvcount(source) < 2 AND source == "macro 1"
| stats dc(host) as count_diff

 

If you don't get desired results, you need to examine data and post relevant data from each macro (anonymize as needed) as well as actual results.

Here is an emulation based on your original mock data.

macro 1macro 2
| makeresults format=csv data="host
a
b
c
d"
| makeresults format=csv data="host
a
b
e
f"

Adding them together (because search command is not used, the subsearch looks different)

 

| makeresults format=csv data="host
a
b
c
d"
| eval source="macro 1"
| append
    [makeresults format=csv data="host
a
b
e
f"
    | eval source="macro 2"]
| stats values(source) as source by host
| where mvcount(source) < 2 AND source == "macro 1"

 

This gives

hostsource
cmacro 1
dmacro 1

If I add | stats dc(host), it gives me 2.

smanojkumar
Communicator

Hi @yuanliu ,

   I think append had some limits to append it, so if we are handling with large volumes of data set like 2lakhs events, which command will be useful!

0 Karma

yuanliu
SplunkTrust
SplunkTrust

   I think append had some limits to append it, so if we are handling with large volumes of data set like 2lakhs events, which command will be useful!

This is why my first reply hinted that restructuring the searches could be a better option.  You did mention each of the two macros were simple index searches with different sourcetype constraints.  It is much more efficient to combine the two searches into one, then stats over their differences.  If you observe how the search I constructed before had to manufacture a field named "source" (which obviously is not your data field named source), you would draw a parallel.

Based on pseudo code of your macros, here is an example of what you can use instead:

index=sap sourcetype IN (1A*, 2A*)
| eval sourcetype = if(match(sourcetype, "^1A"), "1A...", "2A...")
| stats values(sourcetype) as sourcetype by host
| where mvcount(sourcetype) < 2 AND sourcetype == "1A..."
| stats dc(host) as count_diff
0 Karma
Get Updates on the Splunk Community!

Enhance Security Visibility with Splunk Enterprise Security 7.1 through Threat ...

(view in My Videos)Struggling with alert fatigue, lack of context, and prioritization around security ...

Troubleshooting the OpenTelemetry Collector

  In this tech talk, you’ll learn how to troubleshoot the OpenTelemetry collector - from checking the ...

Adoption of Infrastructure Monitoring at Splunk

  Splunk's Growth Engineering team showcases one of their first Splunk product adoption-Splunk Infrastructure ...