I am attempting to write a search that can alert if a user deviates from some normal data viewing pattern. The event log in question records every time a user sees a bit of information, identified by the cID. Sometimes they view the same cID multiple times per day, but I only care about the distinct number they view in some time period. Ultimately, I would like to determine the average number of unique cIDs each user views over some time period (maybe daily, maybe weekly) so that I can look for exceptions and trigger an alert automatically.
So if userA views 150 unique cIDs on average each day (over a 30-day span), and one day they view 400 unique cIDs, I would like an alert to be triggered. I have looked at the "anomalies", "delta", and "outlier" commands, but can't seem to get a working search. I am working on a search that takes the avg(dc(cID)) by username, but that seems to be a dead end due to some Splunk restrictions. I'm not set on using avg() as the determining parameter; I just need something that can detect anomalous behavior.
Anyone have a better approach?
Thanks - I'll look into those. The problem I hit with most of these commands is that I am trying to apply them to distinct_count(cID), rather than take the average or trendline of the cID values themselves. The cID values are unique identifiers, so they have no numeric meaning.
That works to get the current average for the timeframe, but I need to compare it to the most recent day's count to know if I need to generate an alert. So if I take the average of the last 8 days (earliest=-8d@d latest=-2d@d) I need to compare that average to the DC from earliest=-1d@d so that I can determine the diff from normal.
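One way to make that comparison in a single search is sketched below. It is a rough sketch rather than a tested answer: it assumes the sourcetype is event_log and the fields are named user and cID (as in the searches elsewhere in this thread), it treats the seven days before yesterday as the baseline, and the field names daily_cIDs, period, baseline_avg, yesterday_cIDs, and diff, along with the threshold of 100, are made up for illustration. The search first gets one distinct count per user per day, then splits those daily counts into "baseline" and "yesterday" and compares them.

```
sourcetype=event_log earliest=-8d@d latest=@d
| bin _time span=1d
| stats dc(cID) as daily_cIDs by user, _time
| eval period=if(_time >= relative_time(now(), "-1d@d"), "yesterday", "baseline")
| stats avg(eval(if(period="baseline", daily_cIDs, null()))) as baseline_avg
        max(eval(if(period="yesterday", daily_cIDs, null()))) as yesterday_cIDs
        by user
| eval diff = yesterday_cIDs - baseline_avg
| where diff > 100
```

Days on which a user has no events produce no row, so baseline_avg here is an average over active days only. Saved as an alert that triggers when the number of results is greater than zero, this would only fire for users whose count yesterday exceeded their baseline by more than the chosen threshold.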
I would summary index the distinct count of cID values and make sure the user field is also indexed. From there, you should be able to run a "stats range" search against the stored counts, which will give you the difference between daily counts. Finally, filter the output of the "stats range" for values greater than the level you want to trigger on. So in search language, maybe this:
Save this search to summary index every night (also save the count_cID as a field):
sourcetype=event_log | stats dc(cID) as count_cID by user
Run this search every 24+ hours to check the change (using a difference of +-100):
index=summary search_name=<above_saved_search> | stats range(count_cID) as cID_change by user | search cID_change > 100
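If the goal is to compare the most recent day against a per-user norm rather than just the spread, a variation along these lines might work. It is only a sketch: it assumes each summary event carries user and count_cID as fields, it keeps the <above_saved_search> placeholder as-is, and the names avg_cID, stdev_cID, and latest_cID plus the cutoff of two standard deviations are illustrative choices, not anything Splunk prescribes.

```
index=summary search_name=<above_saved_search> earliest=-31d@d
| stats avg(count_cID) as avg_cID stdev(count_cID) as stdev_cID latest(count_cID) as latest_cID by user
| where latest_cID > avg_cID + 2 * stdev_cID
```

Note that the average and standard deviation here include the latest day's value, which dampens the comparison slightly; excluding it would need an extra step to separate the newest summary event from the rest.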
Yes. A search with sistats (just like a search with plain stats) needs to be set up to enable summary indexing. (The "si" prefix commands don't magically feed any data to the summary index. They are just intended to be more summary-index friendly commands.)