I have combined data from two searches and want to compare them to identify what is new in the second search, what is removed from the first, and what is persistent across both searches. My data looks like:
asset event search
1 a 1st
1 a 2nd
1 b 1st
1 c 2nd
I want the results to look like
asset event status
1 a persistent
1 b removed
1 c new
How would I go about doing this? Im thinking a combination of eval with nested if statements, but really not sure if this is the best approach or how to execute.
My high-level approach to this would be:
first search
| eval from_first=1
| append [ second search | eval from_second=1 ]
| stats values(from_first) AS from_first, values(from_second) AS from_second BY asset event
| eval status=case(from_first=1 AND from_second=1, "persistent",
from_first=1 AND isnull(from_second), "removed",
isnull(from_first) AND from_second=1, "new")
| fields - from_first from_second
There may well be a more efficient way to do it if we analyze your specific search queries.
Went the route of the other answer, but this makes good sense too.
@elliotproebstel - That would work... and I'll give you some general comments about improving efficiency, since you brought it up. 😉
In general, you never want to have two searches when one will do. So this pseudocode...
first search
| eval from_first=1
| append [ second search | eval from_second=1 ]
...should be replaced whenever possible by this pseudocode...
(first search) OR (second search)
| eval from_first=case( this is from the first search, 1)
| eval from_second=case( this is from the second search, 1)
To make searches as efficient as possible, one should always get in the habit of getting rid of all unneeded fields. There aren't any in the example data, but just for good practice, there should be a fields
command there.
There is only going to be a single value or none, so values()
and max()
are equivalent. I tend to use max()
in that case, because it implicitly guarantees there will not be multiple values, and you don't ever have to look higher in the code to confirm.
(first search) OR (second search)
| fields asset event ... plus whatever I need to differentiate between searches
| eval from_first=case( this is from the first search, 1)
| eval from_second=case( this is from the second search, 1)
| stats max(from_first) as from_first max(from_second) as from_second by asset event
Technically, you don't have to create the from_* fields there, and since you only want to test existence or not, so you can use a count(eval())
in the stats instead of creating fields.
(first search) OR (second search)
| fields asset event ... plus whatever I need to differentiate between searches
| stats max(eval(test that produces a 1 if first search or null otherwise))) as found_first,
max(eval(test that produces a 1 if second search or null otherwise))) as found_second by asset event
Thank you! As usual, @DalJeanis, I learn so much from your feedback. I really appreciate you taking the time to share this.
@elliotproebstel - Well, I notice you and @kamlesh_vaghela spending a lot of time helping people out, so I decided to invest some time in y'all Dalsplaining whenever I can. @woodcock and @somesoni2 and @jkat54 and many others did that for me, so I'm passing it on, and I'm sure you will too.
By the way, get on the Splunk slack channel if you aren't already. Plenty more good stuff happening there.
Funny, I don’t remember helping much! Happy to know even if I had a small part in creating a splunk monster like you though @daljeanis.
"Dalsplaining" -- that is great.
@elliotproebstel, we love your participation on Answers! On it and will update if there's any roadblack to approval for Slack.
Thanks, @lfedak. I appreciate it!
Thanks. I did send in a request for the Splunk slack channel a few days ago, so hopefully I'll get approved soon!
Hmmm. Okay, @ellliotproebstel tag me sometime this weekend if it hasn't gotten done, and I'll rattle some cages. I thought that was mostly automatic.
@lfedak, any idea who needs to be pinged to allow this extremely helpful person into the slack fold?
^^ See comment above
Try like this(fixed typo on mvcount)
your current search with results from both searches
| table asset event search
| stats values(search) as search by asset event
| eval status=case(mvcount(search)=2,"persistent",search="1st","removed",true(),"new")
| table asset event status
Works great, thanks! Just a typo on mvcount
Oops. Fixed.
Can you post your searches?