How to compare the same search from the previous day to current day using set diff?

darrenaefc
Engager

Hi guys,

I am very new to Splunk (about a month or so) and I am having some trouble incorporating "set diff" into my search to compare the same search from the previous day to now. Can anyone help?
Below is the search I use to retrieve the data I am looking for; I am unsure where and how "set diff" should be used.

index=security sourcetype= (blank) source="myfiles.csv"
| dedup displayName actions | table displayName actions
| rex field=actions "hosts\W+(?P<Hosts>.*?)]" max_match=0
| table displayName Hosts | mvexpand Hosts | makemv delim="," Hosts | rex mode=sed field=Hosts "s/'/ /g"
| mvexpand Hosts

somesoni2
Revered Legend

Using set diff, it has to be done like this (two subsearches, one selecting yesterday's data and the other selecting today's):

| set diff [search index=security sourcetype= (blank) source="myfiles.csv" earliest=-1d@d latest=@d
| dedup displayName actions | table displayName actions
| rex field=actions "hosts\W+(?P<Hosts>.*?)]" max_match=0
| table displayName Hosts | mvexpand Hosts | makemv delim="," Hosts | rex mode=sed field=Hosts "s/'/ /g"
| mvexpand Hosts]  [search index=security sourcetype= (blank) source="myfiles.csv" earliest=@d latest=now
| dedup displayName actions | table displayName actions
| rex field=actions "hosts\W+(?P<Hosts>.*?)]" max_match=0
| table displayName Hosts | mvexpand Hosts | makemv delim="," Hosts | rex mode=sed field=Hosts "s/'/ /g"
| mvexpand Hosts]

But I would rather do it with the search below, which is better: no subsearches, and a single search that selects the data for both yesterday and today.

index=security sourcetype= (blank) source="myfiles.csv" earliest=-1d@d  latest=now
| bucket span=1d _time
| dedup _time displayName actions | table _time displayName actions
| rex field=actions "hosts\W+(?P<Hosts>.*?)]" max_match=0
| table _time displayName Hosts | mvexpand Hosts | makemv delim="," Hosts | rex mode=sed field=Hosts "s/'/ /g"
| mvexpand Hosts
| stats dc(_time) as daysReported by displayName Hosts | where daysReported=1 | table displayName Hosts 
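
For anyone who wants to see the logic outside SPL, here is a minimal Python sketch of what the last line of that search does: count the distinct day buckets per displayName/host pair and keep the pairs seen on exactly one day. The function name and the sample rows are made up for illustration; they are not from the original data. With only two day buckets in play, this should give the same pairs as a symmetric difference of the two days, which (as I understand it) is what set diff returns.

```python
from collections import defaultdict


def seen_on_one_day_only(rows):
    """rows: iterable of (day, display_name, host) tuples, one per expanded event."""
    days_by_pair = defaultdict(set)
    for day, display_name, host in rows:
        # Roughly: stats dc(_time) as daysReported by displayName Hosts
        days_by_pair[(display_name, host)].add(day)
    # Roughly: where daysReported=1
    return sorted(pair for pair, days in days_by_pair.items() if len(days) == 1)


# Hypothetical sample data, standing in for the expanded events.
rows = [
    ("yesterday", "ruleA", "host1"),
    ("yesterday", "ruleA", "host2"),
    ("today", "ruleA", "host2"),
    ("today", "ruleA", "host3"),
]

changed = seen_on_one_day_only(rows)

# Cross-check against a plain symmetric difference of the two days.
yesterday = {(name, host) for day, name, host in rows if day == "yesterday"}
today = {(name, host) for day, name, host in rows if day == "today"}
assert sorted(yesterday ^ today) == changed

print(changed)
```

Note that the result does not say which day a pair came from; keeping the day alongside the pair would be needed to tell "added" from "removed".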


woodcock
Esteemed Legend

I would not use set diff for many reasons; try this:

index=security sourcetype= (blank) source="myfiles.csv" earliest=-2d@d latest=@d
| dedup displayName actions
| fields displayName actions
| rex field=actions max_match=0 "hosts\W+(?P<Hosts>.*?)]"
| fields - actions
| mvexpand Hosts
| makemv delim="," Hosts | rex mode=sed field=Hosts "s/'/ /g"
| mvexpand Hosts
| bin _time span=1d
| eval _time = if(_time < relative_time(now(), "-1d@d"), "YesterYesterDay", "YesterDay")
| chart values(Hosts) OVER displayName BY _time
| nomv YesterYesterDay 
| nomv YesterDay 
| rex field=YesterYesterDay mode=sed "s/[\r\n\s]+/;/g" 
| rex field=YesterDay mode=sed "s/[\r\n\s]+/;/g" 
| eval setdiff = split(replace(replace(replace(replace(mvjoin(mvsort(mvappend(split(replace(YesterYesterDay, "(;|$)", "#1;"), ";"), split(replace(YesterDay, "(;|$)", "#0;"), ";"))), ";"), ";(\w+)#0\;\1#1", ""), ";\w+#1", ""), "#0", ""), ";(?!\w)|^;", ""), ";")
| makemv delim=";" YesterYesterDay
| makemv delim=";" YesterDay

See this run-anywhere example:

| tstats values(sourcetype) AS sourcetype WHERE index=_* earliest=-2d@d latest=@d BY host _time span=1d
| eval _time = if(_time < relative_time(now(), "-1d@d"), "YesterYesterDay", "YesterDay") 
| chart values(sourcetype) OVER host BY _time 
| nomv YesterYesterDay 
| nomv YesterDay 
| rex field=YesterYesterDay mode=sed "s/[\r\n\s]+/;/g" 
| rex field=YesterDay mode=sed "s/[\r\n\s]+/;/g" 
| eval setdiff = split(replace(replace(replace(replace(mvjoin(mvsort(mvappend(split(replace(YesterYesterDay, "(;|$)", "#1;"), ";"), split(replace(YesterDay, "(;|$)", "#0;"), ";"))), ";"), ";(\w+)#0\;\1#1", ""), ";\w+#1", ""), "#0", ""), ";(?!\w)|^;", ""), ";")
| makemv delim=";" YesterYesterDay
| makemv delim=";" YesterDay

martin_mueller
SplunkTrust

The setdiff eval manually calculates the difference between two multivalue fields.
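
In plain set terms, and as far as I can tell from reading the eval, it tags the values from each day, sorts them so that matching values sit next to each other, strips the matched pairs along with everything that only appears in YesterYesterDay, and leaves the values that are in YesterDay but not in YesterYesterDay. Here is a rough Python sketch of that end result (not of the regex trick itself); the function name and the sample host names are hypothetical:

```python
def multivalue_diff(newer, older):
    """Return the sorted values that appear in `newer` but not in `older`."""
    older_set = set(older)
    return sorted({value for value in newer if value not in older_set})


# Hypothetical multivalue fields for one row of the chart.
yester_yester_day = ["web01", "web02", "db01"]
yester_day = ["web02", "db01", "app01"]

setdiff = multivalue_diff(yester_day, yester_yester_day)
print(setdiff)
```

Swapping the two arguments gives the other direction, i.e. the values that disappeared.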

woodcock
Esteemed Legend

Attention @darrenaefc, I had a mistake in my original answer. It works properly now.


woodcock
Esteemed Legend

Which was the core of OP's ask.

woodcock
Esteemed Legend

You can thank @martin_mueller for that setdiff line.


macadminrohit
Contributor

@woodcock, what is this last eval setdiff doing here?

woodcock
Esteemed Legend

It had a bug and wasn't working right. Try it now.

