Hi Splunkers,
I need to create a small app to monitor activity on data inside a corporate application: who, when, where from, what the action was, etc. The sourcetype (JSON-based) will not correlate with other datasets (at least not for now, or not much) and has no more than 15-20 fields.
The volumes of data are not extremely high.
So what is the best way to process this data, with the ability to detect anomalies in user activity?
As of now I like the idea of collecting statistical values (like dc, avg, and values by user or by data) in lookup tables and running some real-time saved searches that correlate against the tables and look for unusual values.
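To illustrate, roughly what I have in mind is a scheduled search that refreshes a baseline lookup (the index, sourcetype, lookup and field names here are just placeholders for my real data):

index=corp_app sourcetype=app_audit earliest=-30d
| stats dc(src) as dc_src avg(date_hour) as avg_hour values(action) as seen_actions by user
| outputlookup user_activity_baseline.csv

and then the real-time/scheduled searches would compare the last hour against that table.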
Does it make sense to have accelerated reports in this case, and would this improve the experience when analyzing the data over long terms (one to two years)? Maybe it would also make it easier to use the anomaly detection capabilities of Splunk SPL against accelerated datasets.
Thank you for any advice.

You should leave it as is until you're ready to correlate it. Without requirements you're just wasting time.
In the end I would recommend summary indexes though. Lookups can become unwieldy when they grow too large and summary indexes are super fast.
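For example, an hourly scheduled search along these lines could write per-user rollups into a summary index, and your anomaly searches would then read the summary instead of the raw events (the index and field names are only illustrative):

index=corp_app earliest=-1h@h latest=@h
| stats count as events sum(bytes) as bytes by user
| collect index=summary_user_activity

index=summary_user_activity
| stats avg(events) as avg_events stdev(events) as stdev_events by user

The summary index has to be created beforehand, and the same searches keep working unchanged as the data grows.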

There are no strict requirements except the ability to detect unusual behavior, and most of the rules are not too complicated in my vision.
The lookups will be limited to hundreds or a few thousand rows, with max, avg, stdev, mode, etc. values for several fields, plus the values of another one or two fields by a particular factor or group (e.g. max(date_hour) avg(date_hour) mode(action) sum(action) ... by user, file). Having this, it's pretty fast to evaluate deviations or run search NOT comparisons.
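For example, the NOT comparison could be a search like this (lookup and field names are placeholders), returning last-hour events from users that have no baseline row yet:

index=corp_app earliest=-1h NOT [ | inputlookup user_activity_baseline.csv | fields user ]

and the deviation checks are just lookup plus where comparisons on the numeric columns.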
So beyond the speed of summary indexes, are there any other benefits? I'm afraid the lookup approach may not let me use the anomaly detection and prediction commands to their full extent, although this isn't an especially large data volume.

I've never seen a UBA case where 2 years of data was anything but HUGE data. But again, you should wait until you have the full requirements, understand exactly what you'll be correlating with, and then decide how to proceed. You're just wasting cycles currently.

Sorry for my persistence and the probably incomplete explanation, but here's an exact example: detect whether a user connects at an unusual time and downloads more data than usual (exceeds avg*2 or the max count per hour).
I have CSV list #1 with user, avg_count_per_hour, and max_count_per_hour, and CSV list #2 with user and values(work_hours). The search correlates the current results for the last hour with these two CSVs.
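Roughly what that rule would look like, assuming list #2 is flattened to one row per user and working hour with an is_work_hour flag instead of a multivalue values(work_hours) column (all names here are placeholders):

index=corp_app action=download earliest=-1h@h latest=@h
| stats count as downloads by user date_hour
| lookup user_hourly_stats.csv user OUTPUT avg_count_per_hour max_count_per_hour
| lookup user_work_hours.csv user, date_hour OUTPUT is_work_hour
| where downloads > 2 * avg_count_per_hour OR downloads > max_count_per_hour OR isnull(is_work_hour)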
And there aren't many cases like this, each with 2 or 3 conditions. The event count is approximately 1 million per month (hundreds of MB).
Does it make sense to turn on accelerations or summaries?

24 million events (2 years * 1 million/month) isn't very much. You probably don't even need accelerations or summaries if you built according to Splunk's recommended hardware specs.

I'd create a report for each of your use cases and if they run slow, accelerate them.
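If one of them does turn out to be slow, report acceleration can be switched on per report from Edit > Edit Acceleration in Splunk Web, or in savedsearches.conf, roughly like this (the stanza name and search are placeholders; only transforming searches such as stats/chart/timechart qualify):

[User download anomaly report]
search = index=corp_app action=download | stats count by user, date_hour
auto_summarize = 1
auto_summarize.dispatch.earliest_time = -2y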

OK, I'll follow your recommendations, thank you!
