Most efficient approach to process data for app

evelenke
Contributor

Hi Splunkers,

I need to create a small app to monitor activity on data inside a corporate application: who did what, when, from where, and so on. The sourcetype (JSON-based) will not be correlated with other datasets (at least for now, or not much) and has no more than 15-20 fields.
The data volumes are not extremely high.
So what is the best way to process this data so that I can detect anomalies in user activity?
For now I like the idea of collecting statistical values (such as dc, avg, values by user or by data) in lookup tables and running some real-time saved searches that correlate against those tables and look for unusual values.

Does it make sense to use accelerated reports in this case? Will that improve the user experience when analyzing the data over long periods (a year or two)? It might also make it easier to use the anomaly detection capabilities of Splunk SPL with accelerated datasets.

Thank you for any advice.

jkat54
SplunkTrust

You should leave it as is until you're ready to correlate it. Without requirements you're just wasting time.

In the end I would recommend summary indexes though. Lookups can become unwieldy when they grow too large and summary indexes are super fast.

evelenke
Contributor

There are no strict requirements except the ability to detect unusual behavior, and most of the rules are not too complicated as I see them.
The lookups will be limited to hundreds or a few thousand rows holding max, avg, stdev, mode, etc. values for several fields, plus lists of values for another 1 or 2 fields, grouped by a particular factor (e.g. max(date_hour) avg(date_hour) mode(action) sum(action) ... by user, file). With these in place it's pretty fast to evaluate deviations or run search NOT comparisons.
So beyond the speed of summary indexes, are there any other benefits? I'm afraid the lookup approach may not let me use the anomaly detection and prediction commands to their full extent, although the data volumes are not very big.
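
To pin down what such a per-user lookup would contain, here is a minimal sketch in Python rather than SPL. It assumes events are already reduced to hypothetical (user, day, hour) tuples, and it averages only over hours in which the user had at least one event; both the input shape and the function name `build_baseline` are illustrative assumptions, not anything Splunk provides.

```python
from collections import Counter, defaultdict
from statistics import mean


def build_baseline(events):
    """Build per-user baseline rows from (user, day, hour) tuples, one per event.

    Assumption: the average is taken over active hours only (hours with at
    least one event), not over every hour of the day.
    """
    hourly = defaultdict(Counter)  # user -> Counter{(day, hour): event count}
    for user, day, hour in events:
        hourly[user][(day, hour)] += 1

    baseline = {}
    for user, counts in hourly.items():
        per_hour = list(counts.values())
        baseline[user] = {
            "avg_count_per_hour": mean(per_hour),
            "max_count_per_hour": max(per_hour),
            # distinct hours of day in which the user was ever active
            "work_hours": sorted({hour for _, hour in counts}),
        }
    return baseline


# Made-up sample events, for illustration only.
events = [
    ("alice", "2024-03-01", 9), ("alice", "2024-03-01", 9),
    ("alice", "2024-03-01", 10), ("alice", "2024-03-02", 9),
    ("bob", "2024-03-01", 14),
]
baseline = build_baseline(events)
for user, row in baseline.items():
    print(user, row)
```

Each resulting row corresponds to one line of the lookup table, keyed by user.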

jkat54
SplunkTrust

I've never seen a UBA case where 2 years of data was anything but HUGE data. But again, you should wait until you have the full requirements, understand exactly what you'll be correlating with, and then decide how to proceed. You're just wasting cycles currently.

evelenke
Contributor

Sorry for my persistence and my probably incomplete explanation, but here's a concrete example: detect when a user connects at an unusual time and downloads more data than usual (exceeds avg*2 or the max count per hour).
I have CSV list #1 with user, avg_count_per_hour, max_count_per_hour, and CSV list #2 with user, values(work_hours). The search correlates the result for the last hour against these 2 CSVs.
There aren't many cases like this, each with 2 or 3 conditions. The event count is approximately 1 million per month (hundreds of MB).
Does it make sense to turn on acceleration or summaries?
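
The rule described above can be sketched as follows, in Python rather than SPL, purely to make the conditions explicit. It reads the rule as "unusual hour AND (count > avg*2 OR count > max)", which is one possible interpretation; the merged baseline dictionary, the sample numbers, and the function name `find_anomalies` are all hypothetical, and users missing from the baseline are simply skipped.

```python
def find_anomalies(last_hour, baseline):
    """Return users whose last-hour activity breaks both conditions.

    last_hour: iterable of (user, hour_of_day, event_count) tuples.
    baseline:  user -> dict with avg_count_per_hour, max_count_per_hour,
               work_hours (the two CSV lists merged into one structure).
    """
    flagged = []
    for user, hour, count in last_hour:
        stats = baseline.get(user)
        if stats is None:
            continue  # assumption: users without a baseline are not evaluated
        unusual_time = hour not in stats["work_hours"]
        unusual_volume = (
            count > 2 * stats["avg_count_per_hour"]
            or count > stats["max_count_per_hour"]
        )
        if unusual_time and unusual_volume:
            flagged.append(user)
    return flagged


# Made-up baseline and last-hour results, for illustration only.
baseline = {
    "alice": {"avg_count_per_hour": 12.0, "max_count_per_hour": 40,
              "work_hours": set(range(9, 18))},
    "bob": {"avg_count_per_hour": 30.0, "max_count_per_hour": 90,
            "work_hours": set(range(8, 17))},
}
last_hour = [("alice", 23, 30), ("bob", 23, 20), ("carol", 3, 500)]
flagged = find_anomalies(last_hour, baseline)
print(flagged)
```

With only a few thousand baseline rows, this kind of comparison is a cheap per-row check, which is why the data volume of the raw events matters more than the size of the lookups.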

jkat54
SplunkTrust

24 million events (2 years * 1 million/month) isn't very much. You probably don't even need acceleration or summaries if you built according to Splunk's recommended hardware specs.

jkat54
SplunkTrust

I'd create a report for each of your use cases and, if they run slowly, accelerate them.

evelenke
Contributor

OK, I'll follow your recommendations, thank you!
