Dynamic Threshold Calculation in a Splunk Alert

Path Finder

I have a market data feed indexing into Splunk.

The logs look like the following:

Security: "HDFC", FIELDS: {"PRICE", "ASK", "HIGH"}, receivedTime: <time-string>
Security "YESBANK", FIELDS= {"PRICE", "HIGH"}, receivedTime: <time-string>
Security: "HDFC", FIELDS: {"ASK", "HIGH"}, receivedTime: <time-string>
Security: "HDFC", FIELDS: {"PRICE"}, receivedTime: <time-string>

Security: a single-value field
FIELDS: a multi-value field
receivedTime: a string; it can differ from _time

  • Close to 5,000 securities log in daily.
  • That adds up to about 10 GB of license usage per day, so it's a large number of events.

We want to find the SECURITY:FIELD pairs that are logging less frequently than their usual input frequency.

So, for a SECURITY:FIELD pair:

diff_time = receivedTime (current) - receivedTime (previous)

This diff_time varies from one SECURITY:FIELD pair to another: some pairs log every second, others only once a day.

The challenge is to come up with an alert (or alerts) that dynamically calculates the optimum frequency (diff_time) for each SECURITY:FIELD pair and then compares it with its current value.

Now let's say we assume that the optimum frequency is the average of the last 7 input gaps for the same SECURITY:FIELD pair.
To calculate this value I would have to run the query over the last 7 days (because some pairs log only once a day), and with that amount of data plus the mvexpand command, this is not viable.
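
To make the goal concrete, here is a minimal SPL sketch of that calculation (the index name and timestamp format are assumptions, and it pretends FIELD is already a single-value field, so it ignores the multi-value/scale problem for the moment):

index=market_feed
| eval rtime=strptime(receivedTime, "%Y-%m-%d %H:%M:%S") ``` timestamp format is an assumption ```
| sort 0 SECURITY FIELD rtime
| streamstats current=f last(rtime) as prev_rtime by SECURITY FIELD ``` previous event's time per pair ```
| eval diff_time = rtime - prev_rtime
| streamstats global=f window=7 avg(diff_time) as avg_diff_time by SECURITY FIELD ``` mean of the last 7 gaps per pair ```

Running something like this over 7 days of data is exactly what doesn't scale, which is the problem.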

How do you suggest I achieve this goal? Please suggest an algorithm for it.

  • I can't use a lookup table because of its size: a large burst of input data could bring down the whole Splunk deployment if the lookup table grows wildly.

Super Champion

Hi @iparitosh,

Your algorithm should be something like this:
1- Fetch all the data you need --> index=yourindex sourcetype=yoursourcetype filter=yourfilter
2- Make sure your multi-value field is extracted, either via props/transforms or by using the rex command with the max_match option.
3- To avoid using mvexpand for that multi-value field, run a stats command to convert your data into tabular form:
... | stats values(requiredFields) as requiredFields by SECURITY, FIELD, RECEIVEDTIME
4- Once you have that table, use it to calculate the delta and frequency; it shouldn't be too resource-intensive at that point anymore. A sketch of steps 2 and 3 follows below.
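
A rough sketch of steps 2 and 3, with regexes and field names that are assumptions based on your sample logs:

index=yourindex sourcetype=yoursourcetype
| rex "Security[:=]?\s*\"(?<SECURITY>[^\"]+)\"" ``` step 2: single-value field ```
| rex "FIELDS[:=]?\s*\{(?<FIELDLIST>[^}]+)\}"
| rex "receivedTime[:=]?\s*(?<RECEIVEDTIME>.+)$" ``` time-string format left as a placeholder ```
| eval FIELD=split(replace(FIELDLIST, "[\" ]", ""), ",") ``` multi-value field, one value per name ```
| stats count by SECURITY, FIELD, RECEIVEDTIME ``` step 3: no mvexpand needed ```

The key point is that stats with a multi-value field in the by clause implicitly produces one row per value, so you get the SECURITY:FIELD table without paying the mvexpand memory cost.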




You probably need to use a dynamic outlier model. Try using this -
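
For example, a minimal self-contained version of a dynamic threshold (assuming diff_time has already been computed per SECURITY:FIELD pair, as sketched above) flags any gap well outside that pair's own history:

... | eventstats avg(diff_time) as mean_gap stdev(diff_time) as sd_gap by SECURITY, FIELD
| eval threshold = mean_gap + 3 * coalesce(sd_gap, 0) ``` 3-sigma bound; the multiplier is tunable ```
| where diff_time > threshold

The mean/stdev pair adapts the threshold to each SECURITY:FIELD pair's own cadence instead of using one fixed value for all of them.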


Path Finder

Thank you for your response. I am reading more about it to check if it can solve my problem.
