Is it possible for splunk to get an output something along the lines of:
Source:
Col_A | Col_B | Col_C
ID_A | log 1 | yes
ID_A | log 2 | no
ID_A | log 3 | no
ID_B | log 4 | no
ID_B | log 5 | no
sort Col_A
| if Col_C == yes, then search and include all rows where Col_A == ID_A
| eval to combine ID_A into one cell
(will filter away records with ID_B as all of it's Col_C == no)
Desired Result
ID_A | log 1 | yes
ID_A | log 2 | no
ID_A | log 3 | no
Am I able to use splunk for the middle logic? ->if Col_C == yes, then search and include all rows where Col_A == ID_A
hi @cboonyan,
Did you get a chance to try out either of the below answers? If so, go ahead and approve the answer that helped ya. If not, keep us updated about your issue, so users can continue to help out.
Thanks for posting!
@cboonyan,
Try this
"your search"|sort Col_A|eventstats values(Col_C) as vals by Col_A|eval found=mvfind(vals,"yes")
|where found=1|fields - vals,found
Here by using eventstats values(Col_C) as vals
, we are creating a list of distinct values of Col_C and appending to the existing results. Once we have the multivalue field "vals", we can search in that field for "yes" by using mvfind
which will output 1 (index of first matching) for matching and null for not matching result. Finally display only those rows which has found "1" and remove, "vals" and "found" from our result set.
If you need all Col_A in single cell, you could add |stats list(Col_B) as Col_B,list(Col_C) as Col_C by Col_A
to the end of the search
"your search"|sort Col_A|eventstats values(Col_C) as vals by Col_A|eval found=mvfind(vals,"yes")
|where found=1|fields - vals,found|stats list(Col_B) as Col_B,list(Col_C) as Col_C by Col_A
@cboonyan, did it work for you?
thanks @renjith.nair for your response, although it wasn't exactly what I needed, you pointed me in the right direction! on my end, somehow found=mvfind(vals,"yes") did not evaluate to 1 for some reason. In the end, with trial and error, I went with eventstats values(Col_C) as vals by Col_C | eval found=mvfind(vals,"yes") which actually evaluates to 0 for those with "yes" and a null empty field for those that are not "yes".
All in all, eventstats values(Col_C) as vals by Col_C | eval found=mvfind(vals,"yes") | where found=0 works for me.
If you do not mind, can you explain what does the by Col_A do?
Hi @cboonyan,
Let me explain that. When we do an eventstats of values(Col_C) as vals by Col_A, it creates a new column "vals" and add all distinct values of Col_C (no,yes) for each row of Col_A.
Col_A Col_B Col_C found vals
ID_A log 1 yes 1 no
yes
ID_A log 2 yes 1 no
yes
ID_A log 3 yes 1 no
yes
ID_B log 4 no no
ID_B log 5 no no
The column vals is a multi value field. Now with mvfind, we search in the vals for a "yes" and mvfind returns the index of matching starting with '0'. In our case, since the "yes" is second value in the list, it always returns 1. It's possible that you are applying this on a different dataset and not what you have given in this example. Neverthless, we can generalize it by using the below search (|where NOT isnull(found)
) instead of using 0 or 1.
"your search"|sort Col_A|eventstats values(Col_C) as vals by Col_A
|eval found=mvfind(vals,"yes")
|where NOT isnull(found)|fields - vals,found
If you are grouping by Col_C itself , it might not be correct. Let's know if you need any further assistance
| reverse
| eval _time=now()
| transaction Col_A startswith=(Col_C=yes)