Alerting

Dynamic Threshold Calculation in Splunk alert

iparitosh
Path Finder

I have a market data feed indexing into Splunk.

The logs look like the following:

Security: "HDFC", FIELDS: {"PRICE", "ASK", "HIGH"}, receivedTime: <time-string>
Security "YESBANK", FIELDS= {"PRICE", "HIGH"}, receivedTime: <time-string>
Security: "HDFC", FIELDS: {"ASK", "HIGH"}, receivedTime: <time-string>
Security: "HDFC", FIELDS: {"PRICE"}, receivedTime: <time-string>

Security: a single-value field
FIELDS: a multi-value field
receivedTime: a string, which can be different from _time

  • Close to 5,000 securities log in daily.
  • It's about 10 GB of license usage per day, so it's a large number of events.

We want to identify the SECURITY:FIELD pairs that are logging less frequently than their usual input frequency.

So, for a SECURITY:FIELD pair:

diff_time = receivedTime (previous) - receivedTime (current)

This diff_time varies from one SECURITY:FIELD pair to another. Some pairs log every second, others only once a day.

The challenge is to come up with an alert (or alerts) that dynamically calculates the optimum frequency (diff_time) for each SECURITY:FIELD pair and then compares it with its current value.

Now let's say we assume that the optimum frequency is the average of the last 7 input intervals for the same SECURITY:FIELD pair.
To calculate this value I would have to run the query over the last 7 days (because some pairs log only once a day), and with that amount of data and the use of the mvexpand command, this is not viable.
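
For clarity, the brute-force calculation I have in mind per pair looks roughly like the sketch below. The index, sourcetype and receivedTime format are placeholders, and it assumes SECURITY and FIELD are already available as single-value fields per event, which is exactly where mvexpand would normally come in:

index=market_data sourcetype=market_feed
| eval rt=strptime(receivedTime, "%Y-%m-%dT%H:%M:%S")
| sort 0 SECURITY, FIELD, rt
| streamstats current=f last(rt) as prev_rt by SECURITY, FIELD
| eval diff_time=rt-prev_rt
| streamstats current=f window=7 avg(diff_time) as usual_diff by SECURITY, FIELD
| where diff_time > 2 * usual_diff

Here diff_time comes out as the positive interval between consecutive events of a pair, usual_diff is the average of the previous 7 intervals, and the factor of 2 is just an arbitrary starting point for "less frequent than usual". Running something like this over 7 days of raw data is what doesn't scale.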

How do you suggest I achieve this goal? Please suggest an algorithm for it.

  • I can't use a lookup table because of its size: a large burst of input data could bring down the whole Splunk instance if the lookup table grows wildly.

DavidHourani
Super Champion

Hi @iparitosh,

Your algorithm should be something like this:
1- Fetch all the data you need --> index=yourindex sourcetype=yoursourcetype filter=yourfilter
2- Make sure your multi-value field is extracted, either via props/transforms or using the rex command with the max_match option.
More info here: https://docs.splunk.com/Documentation/Splunk/7.2.6/SearchReference/Rex
3- To avoid using mvexpand for that multi-value field, run a stats command to convert your data into tabular form:
...|stats values(requiredFields) as requiredFields by SECURITY,FIELD,RECEIVEDTIME
4- Once you have that table, use it for calculating the delta and frequency; it shouldn't be too resource-intensive at this point anymore. See the sketch below for how the steps fit together.
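
Putting the four steps together, a rough sketch could look like this (the field names, regexes and timestamp format are illustrative only; adapt them to your actual events):

index=yourindex sourcetype=yoursourcetype filter=yourfilter
| rex "Security[:=]?\s*\"(?<SECURITY>[^\"]+)\""
| rex "FIELDS[:=]\s*\{(?<FIELDLIST>[^}]*)\}"
| rex "receivedTime[:=]\s*(?<receivedTime>.+)$"
| eval FIELD=split(replace(FIELDLIST, "[\" ]", ""), ",")
| stats count by SECURITY, FIELD, receivedTime
| eval rt=strptime(receivedTime, "%Y-%m-%dT%H:%M:%S")
| sort 0 SECURITY, FIELD, rt
| streamstats current=f last(rt) as prev_rt by SECURITY, FIELD
| eval diff_time=rt-prev_rt

Using the multi-value FIELD in the stats by clause is what gives you one row per SECURITY:FIELD pair and receivedTime without mvexpand; from there the streamstats/eval part computes the delta per pair.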

Cheers,
David


Sukisen1981
Champion

You probably need to use a dynamic outlier model. Try using this: https://docs.splunk.com/Documentation/MLApp/4.2.0/User/DNOlegacyassist
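
To give an idea of what that style of outlier detection looks like in plain SPL, a standard-deviation band per pair is one simple form of it. A minimal sketch, assuming diff_time has already been computed per SECURITY:FIELD pair as discussed above, with an arbitrary 2-sigma multiplier:

... | eventstats avg(diff_time) as avg_diff, stdev(diff_time) as stdev_diff by SECURITY, FIELD
| eval upper_bound=avg_diff+2*stdev_diff
| where diff_time > upper_bound

The ML Toolkit assistant essentially builds and manages searches along these lines (with a choice of boundary methods such as standard deviation or IQR), so it can save you from hand-tuning a multiplier for every pair.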


iparitosh
Path Finder

Thank you for your response. I am reading more about it to check if it can solve my problem.
