Dynamic Threshold Calculation in a Splunk Alert

Path Finder

I have a market data feed indexing into Splunk.

The logs look like the following:

Security: "HDFC", FIELDS: {"PRICE", "ASK", "HIGH"}, receivedTime: <time-string>
Security "YESBANK", FIELDS= {"PRICE", "HIGH"}, receivedTime: <time-string>
Security: "HDFC", FIELDS: {"ASK", "HIGH"}, receivedTime: <time-string>
Security: "HDFC", FIELDS: {"PRICE"}, receivedTime: <time-string>

Security: a single-value field
FIELDS: a multi-value field
receivedTime: a string; it can differ from _time

  • Close to 5,000 securities log in daily.
  • That adds up to about 10 GB of license usage per day, so it's a large number of events.

We want to find the SECURITY:FIELD pairs that are logging less frequently than their usual input frequency.

So, for a SECURITY:FIELD pair:

diff_time = receivedTime (current) - receivedTime (previous)

This diff_time varies from one SECURITY:FIELD pair to another: some pairs log every second, others only once a day.

The challenge is to come up with an alert (or alerts) that dynamically calculates the optimum frequency (diff_time) for each SECURITY:FIELD pair and then compares it with its current value.

Now let's say we assume that the optimum frequency is the average of the last 7 input gaps for the same SECURITY:FIELD pair.
To calculate this value I would have to run the query over the last 7 days (because some pairs log only once a day), and with that amount of data plus the mvexpand command, this is not viable.
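
To make the goal concrete, here is a minimal SPL sketch of that calculation (the index name and timestamp format are assumptions, and it pretends FIELD is already a single-value field, so it ignores the multi-value/scale problem for the moment):

index=market_feed
| eval rtime=strptime(receivedTime, "%Y-%m-%d %H:%M:%S") ``` timestamp format is an assumption ```
| sort 0 SECURITY FIELD rtime
| streamstats current=f last(rtime) as prev_rtime by SECURITY FIELD ``` previous event's time per pair ```
| eval diff_time = rtime - prev_rtime
| streamstats global=f window=7 avg(diff_time) as avg_diff_time by SECURITY FIELD ``` mean of the last 7 gaps per pair ```

Running something like this over 7 days of data is exactly what doesn't scale, which is the problem.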

How do you suggest I achieve this goal? Please suggest an algorithm for it.

  • I can't use a lookup table because of its size: a large burst of input data could bring down the whole Splunk deployment if the lookup table grows wildly.

Super Champion

Hi @iparitosh,

Your algorithm should be something like this:
1- Fetch all the data you need --> index=yourindex sourcetype=yoursourcetype filter=yourfilter
2- Make sure your multi-value field is extracted, either via props/transforms or by using the rex command with the max_match option.
3- To avoid using mvexpand for that multi-value field, run a stats command to convert your data into tabular form:
... | stats values(requiredFields) as requiredFields by SECURITY, FIELD, RECEIVEDTIME
4- Once you have that table, use it to calculate the delta and frequency; it shouldn't be too resource-intensive at that point anymore. A sketch of steps 2 and 3 follows below.
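
A rough sketch of steps 2 and 3, with regexes and field names that are assumptions based on your sample logs:

index=yourindex sourcetype=yoursourcetype
| rex "Security[:=]?\s*\"(?<SECURITY>[^\"]+)\"" ``` step 2: single-value field ```
| rex "FIELDS[:=]?\s*\{(?<FIELDLIST>[^}]+)\}"
| rex "receivedTime[:=]?\s*(?<RECEIVEDTIME>.+)$" ``` time-string format left as a placeholder ```
| eval FIELD=split(replace(FIELDLIST, "[\" ]", ""), ",") ``` multi-value field, one value per name ```
| stats count by SECURITY, FIELD, RECEIVEDTIME ``` step 3: no mvexpand needed ```

The key point is that stats with a multi-value field in the by clause implicitly produces one row per value, so you get the SECURITY:FIELD table without paying the mvexpand memory cost.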




You probably need to use a dynamic outlier model. Try using this -
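
For example, a minimal self-contained version of a dynamic threshold (assuming diff_time has already been computed per SECURITY:FIELD pair, as sketched above) flags any gap well outside that pair's own history:

... | eventstats avg(diff_time) as mean_gap stdev(diff_time) as sd_gap by SECURITY, FIELD
| eval threshold = mean_gap + 3 * coalesce(sd_gap, 0) ``` 3-sigma bound; the multiplier is tunable ```
| where diff_time > threshold

The mean/stdev pair adapts the threshold to each SECURITY:FIELD pair's own cadence instead of using one fixed value for all of them.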


Path Finder

Thank you for your response. I am reading more about it to check if it can solve my problem.
