In our system, every visit consists of one or more actions. Each action has a name, stored in Splunk as the field "transId". Each time an action is triggered, it gets a unique sequence number, stored in the field "gsn". Each customer has a unique ID, stored in the field "uid". While a customer is visiting our system, they have a unique session ID, stored in the field "sessionId". To locate a complete operation by a user, we need to use uid and sessionId together. As in many other systems, the order of actions in our system is fixed under normal circumstances.
We want to create an alert that detects actions occurring out of order. For example, an important action named "D" comes at the end of an action chain. Under normal circumstances, you must access our system in the order "A B C D". But an attacker may skip action B, which may be the action that verifies their identity. The problem is that I don't know which commands will find these abnormal results. We accept that we will need to enter the expected order for every action chain. Ideally, the order would be read from a configuration file.
| stats count by sessionId uid transId gsn _time
| sort 0 sessionId uid _time
I can get every user's order of actions with this search.
Can you give me some advice? If you need more information, please ask me here.
Best wishes!
A quick and non-scientific approach might just count the number of distinct transId values per sessionId and alert on unexpected count values:
| stats earliest(_time) as _time dc(transId) as dc_transId by sessionId
| where dc_transId!=4
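Counting distinct values does not check the order itself. A minimal sketch of an order check, assuming the expected chain is exactly "A B C D" as in your example and that each session has fewer than 100 events (list() keeps at most 100 values per group): sort the events, collect each session's transId values in order, join them into one string, and flag sessions that end in D without matching the expected chain. The field name chain is only an illustrative choice.

```spl
| sort 0 sessionId uid _time
| stats earliest(_time) as _time list(transId) as transId by sessionId uid
| eval chain=mvjoin(transId, " ")
| where mvindex(transId, -1)="D" AND chain!="A B C D"
```

The hard-coded "A B C D" could probably be replaced with a value read from a lookup file, one row per action chain, which is not shown here. Sessions that have not yet reached D are ignored by the last line, so partial sessions should not trigger the alert.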
You can also use algorithms from the Splunk Machine Learning Toolkit. For example, this search associates a set of training transId values with a sequence value by sessionId and fits the data with AutoPrediction, which uses a random forest classifier for a categorical field such as transId:
| sort 0 _time
| streamstats count as sequence by sessionId
| fit AutoPrediction transId from sequence into Jackiifilwhh_behavior_model
The sequence field is similar to your gsn field, but it is local to each session rather than global.
You can subsequently monitor live data with the assumption that all sessions in the time range are complete. Partial sessions will generate false positives:
| sort 0 _time
| streamstats count as sequence by sessionId
| apply Jackiifilwhh_behavior_model
| where 'predicted(transId)'!=transId