Splunk Search

moving average query

vw5qb73
Explorer

Hello - I m collecting some user metrics in below format. customer's trVol ( transactionvolume)

2017-05-29 04:50:01,customer=ABC,trVol=2009,elapsedtimeAvg=175.7988
2017-05-29 04:50:01,customer=DEF,trVol=500,elapsedtimeAvg=50.2

Now, i need to identify those customers who is sending 50% or more transactions in last 1 hour compared to the moving average of transaction volume for last 3 weeks

Can you help me with the SPL?

sourcetype= customermetric customer!= Unknown
| streamstats window=6 global=f avg(trVol) as rollingavg by customer

Customer metric is gathered every 10 minutes.

0 Karma

DalJeanis
Legend

Okay, so you have to calculate two pieces of information -

(1) what is the customer's moving average transaction volume per unit of time for the last 3 weeks
(2) what is the customer's transaction volume for the most recent unit of time.

Your best bet for efficiency will probably be to build a summary index, but let's ignore that for a moment.

You apparently have the calculation for hourly average, so we drop in the 3-week average calculation (I think time_window works better than window for this matter) and then compare...

sourcetype= customermetric customer!= Unknown 
| streamstats time_window=3w avg(trVol) as rollingavg3w by customer
| streamstats window=6 global=f avg(trVol) as rollingavg1h by customer
| where rollingavg1h>1.5*rollingavg3w

DalJeanis
Legend

What I haven't commented on is the "50% more" part of your query. Unless your customers have VERY steady volumes, you would probably be better off calculating the 95th percentile (or something like that). Run a few runs and compare the p95 to 1.5*avg and see what the shape of your data really is.

 sourcetype= customermetric customer!= Unknown 
 | streamstats time_window=3w avg(trVol) as W3avg  p95(trVol) as W3p95  by customer
 | streamstats window=6 global=f avg(trVol) as rollingavg1h by customer
 | where rollingavg1h>W3p95
0 Karma

Ravan
Path Finder

trendline command along with timechart might work here.

Ex:

sourcetype= customermetric customer!= Unknown   earliest=-3w |bin span=1h
_time | stats sum(trVol) AS volume_hour by user,_time| trendline sma504(volume_hour) AS sma_avg|your logic to caluculate 50 % diff btw hour count vs avg |... search filters
0 Karma
Get Updates on the Splunk Community!

Splunk Enterprise Security 8.x: The Essential Upgrade for Threat Detection, ...

 Prepare to elevate your security operations with the powerful upgrade to Splunk Enterprise Security 8.x! This ...

Get Early Access to AI Playbook Authoring: Apply for the Alpha Private Preview ...

Passionate about security automation? Apply now to our AI Playbook Authoring Alpha private preview ...

Reduce and Transform Your Firewall Data with Splunk Data Management

Managing high-volume firewall data has always been a challenge. Noisy events and verbose traffic logs often ...