I am searching Splunk logs for potential fraud. If someone logs in to a system, logs right back out, and then logs in and out again (with the pattern repeated several times), I would like to flag it as potential fraud.
index="weblogs" AND host=* AND (URL="login" OR URL="logout") | stats count by Host_Name, host, URL, _time | sort _time | eval time=strftime(_time, "%Y-%d-%m %H:%M:%S") | stats list(time) as time, list(host) as host, list(URL) as URL, list(Method) as method, list(ls) as ls, list(HTTP_Code) as Http_Code, list(Code) as code, list(fraud) as fraud by Host_Name | eval fraud=if((URL=="login") AND (URL=="logout"), "Possible Fraud", "No Fraud")
So this query works and gives me the results I want, but I need to tweak it so that it catches the repeating pattern and not just anything that has a login and a logout.
It may be because it's 2 a.m., but any help would be appreciated.
There are various options for this one. First, I notice you are not carrying the event count from the first stats command into the second stats command. I would think that you'd want to know whether they had three events or ten in a given time period.
Second, you can use the bin command to chunk up the events into an interval that is more manageable. Let's say for the sake of argument that you don't need the actual _time, just which 15-minute increment the suspicious activity is in.
You could use:
index="weblogs" AND host=* AND (URL="login" OR URL="logout")
| bin span=15m _time as MyTime
| stats count as trancount by Host_Name, host, URL, MyTime
Now you have the guy's activity for each host. I'm assuming Host_Name is the logon ID. Let's sum up the above records to get the total number of logons and logoffs in the time increment, along with a list and count of all the host-URL combinations.
| eval host_URL = host." - ".URL." - ".trancount
| stats count as typecount, dc(host) as hostcount, dc(URL) as URLcount,
sum(trancount) as sumtrancount, list(host_URL) as host_URL by Host_Name, MyTime
So, presumably you'd test that for a number of events (sumtrancount) greater than some threshold, and for the presence of both login and logout (URLcount>1) if you wanted. Myself, I'd figure that more than x logons OR x logoffs in a given time frame would be suspicious, but that's your call, because you know your data.
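As a rough sketch of that test, the lines below could be appended to the search above. The field name assumed_threshold and its value of 6 are placeholders I made up for illustration, not anything from your data, so tune them to whatever counts as unusual on your systems. The fraud label just mirrors the one in your original query.

```spl
| eval assumed_threshold=6
| where sumtrancount > assumed_threshold AND URLcount > 1
| eval fraud="Possible Fraud"
```

Only the Host_Name and MyTime combinations that pass both conditions survive the where command, so anything left in the results is a 15-minute window worth a closer look. Drop the URLcount>1 condition if a burst of logins alone should also be flagged.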