I want to get the last index of my target value for a multi-value field. For example,
id | chain |
1 | SendMessage CheckMessage PayForIt |
2 | CheckMessage SendMessage CheckMessage PayForIt |
3 | PayForIt SendMessage CheckMessage PayForIt |
4 | SendMessage PayForIt CheckMessage |
If "PayForIt" appears, meanwhile "SendMessage" and "CheckMessage" appears before it, this is a normal event. But if "SendMessage" or "CheckMessage" don't appear, or after "PayForIt", it is a abnormal event. It means you must send message and be verified by SMS before you pay for something!
The id 1, 2 and 3 above are normal, 4 is abnormal.
I've tried mvfind like below, but it will treat 2, 3 as abnormal event!
eval send=mvfind(chain,"SendMessage")
| eval check=mvfind(chain,"CheckMessage")
| eval pay=mvfind(chain,"PayForIt")
| where isnotnull(pay) and isnotnull(check) and isnotnull(send) and pay>check and check>send
This restated question is much clearer about your intentions and logic. What you need is to iterate "SendMessage" over "CheckMessage" using foreach to compare with the index of "PayForIt". Like this:
| foreach SendMessage CheckMessage
[ eval abnormal = if(mvfind(chain, "<<FIELD>>") > mvfind(chain, "PayForIt") OR NOT in(chain, "<<FIELD>>") OR abnormal == "yes", "yes", "no") ]
followed by a where command.
This is an emulation that you can play with and compare with your real data.
| makeresults
| eval _raw ="id chain
1 SendMessage,CheckMessage,PayForIt
2 CheckMessage,SendMessage,CheckMessage,PayForIt
3 PayForIt,SendMessage,CheckMessage,PayForIt
4 SendMessage,PayForIt,CheckMessage
5 SendMessage,PayForIt"
| multikv forceheader=1
| eval chain = split(chain, ",")
| table id chain
``` data emulation above ```
id | chain | abnormal |
1 | SendMessage CheckMessage PayForIt | no |
2 | CheckMessage SendMessage CheckMessage PayForIt | no |
3 | PayForIt SendMessage CheckMessage PayForIt | yes |
4 | SendMessage PayForIt CheckMessage | yes |
5 | SendMessage PayForIt | yes |
Thanks a lot! Actually, a customer could pay for some thing for many times, so the field "chain" in Splunk will consist of several "PayForIt". I need to check the every part of it, which split by "PayForIt". I also need to get the last index of "SendMessage" or "CheckMessage" before every "PayForIt". It seems that "mvfind" won't work properly, because it doesn't get the last but the first index.
That is why you always need to give all critical details about data. Just as you described, we'll split at "PayForIt" before proceeding further.
| eval chain = mvmap(chain, replace(chain, "PayForIt", "PayForIt:")) ``` add marker for split ```
| eval chain = split(mvjoin(chain, ","), ":") ``` split right after PayForIt ```
| mvexpand chain ``` each PayForIt in its own event ```
| eval chain = split(chain, ",")
| where isnotnull(chain)
| foreach SendMessage CheckMessage
[ eval abnormal = if(mvfind(chain, "<<FIELD>>") > mvfind(chain, "PayForIt") OR NOT in(chain, "<<FIELD>>") OR abnormal == "yes", "yes", "no") ]
You get something like
id | chain | abnormal |
1 | SendMessage CheckMessage PayForIt | no |
1 | SendMessage CheckMessage PayForIt | no |
2 | CheckMessage SendMessage CheckMessage PayForIt | no |
3 | PayForIt | yes |
3 | SendMessage CheckMessage PayForIt | no |
4 | SendMessage PayForIt | yes |
4 | CheckMessage | yes |
5 | SendMessage PayForIt | yes |
Emulated data are the following:
id | chain |
1 | SendMessage CheckMessage PayForIt SendMessage CheckMessage PayForIt |
2 | CheckMessage SendMessage CheckMessage PayForIt |
3 | PayForIt SendMessage CheckMessage PayForIt |
4 | SendMessage PayForIt CheckMessage |
5 | SendMessage PayForIt |
This is used for the emulation:
| makeresults
| eval _raw ="id chain
1 SendMessage,CheckMessage,PayForIt,SendMessage,CheckMessage,PayForIt
2 CheckMessage,SendMessage,CheckMessage,PayForIt
3 PayForIt,SendMessage,CheckMessage,PayForIt
4 SendMessage,PayForIt,CheckMessage
5 SendMessage,PayForIt"
| multikv forceheader=1
| eval chain = split(chain, ",")
| table id chain
``` data emulation above ```
Does something like this help?
| streamstats count as row
| mvexpand chain
| streamstats count by row chain
| stats max(count) as lastindex by row chain
Thank you! I've modified my question, it seems that I can't expand chain to a single value field!
Why can't you mvexpand chain?
Also, how is chain created in the first place - can you tag the occurrences of each chain type before creating the multivalue field?
Well, let me describe the generation of field "chain".
A user's access to my system is based on a unique session_ id(same name's field in Splunk). And the whole access can be divided into many transactions, named trans_id in Splunk. So I can use the command below to get a user's path of transactions in time order, which named "chain".
stats count by session_id trans_id _time
| eval lux=_time+"|"+trans_id
| stats values(lux) as chain by session_id
| eval chain=mvmap(chain,mvindex(split(chain,"|"),1))
Above is the process of generating field "chain". In a word, I want to find the abnormal event, which means that the order is incorrect or the value is incomplete. Just like a customer must be verified by SMS before pay something.
Thanks a lot!
Your generation of chain is essentially deduping _time|trans_id and putting them in time order in a multivalue field.
You could try something like this
| dedup _time trans_id session_id
| sort 0 _time
| streamstats count as sequence_number global=f by session_id
| eval SendMessageSequence = if(trans_id=="SendMessage", sequence_number, null())
| eval CheckMessageSequence = if(trans_id=="CheckMessage", sequence_number, null())
| eval PayForItSequence = if(trans_id=="PayForIt", sequence_number, null())
| stats max(CheckMessageSequence) as lastCheckMessage max(SendMessageSequence) as lastSendMessage min(PayForItSequence) as firstPayForIt by session_id
| eval good=if(firstPayForIt > lastSendMessage AND firstPayForIt > lastCheckMessage, "OK", "NOK")
Thanks a lot! I've tried this command, it works well! But when a customer pay for something several times, the filed "chain" in Splunk will contain several "PayForIt". It seems that I split "chain" into several parts by "PayForIt" and check the value in every part, which would work! If it is a Java program, I will iterate the multifield or a string to check every part, but how can I make it in Splunk?
Assuming you want to check for every PayForIt there is a Send and Check, try something like this
| dedup _time trans_id session_id
| sort 0 -_time
| streamstats count(eval(trans_id=="PayForIt")) as payments global=f by session_id
| sort 0 session_id _time
| streamstats count as sequence_number global=f by session_id payments
| eval SendMessageSequence = if(trans_id=="SendMessage", sequence_number, null())
| eval CheckMessageSequence = if(trans_id=="CheckMessage", sequence_number, null())
| eval PayForItSequence = if(trans_id=="PayForIt", sequence_number, null())
| stats max(CheckMessageSequence) as lastCheckMessage max(SendMessageSequence) as lastSendMessage min(PayForItSequence) as firstPayForIt by session_id payments
| eval good=if(firstPayForIt > lastSendMessage AND firstPayForIt > lastCheckMessage, "OK", "NOK")