Splunk Search

moving average query

vw5qb73
Explorer

Hello - I m collecting some user metrics in below format. customer's trVol ( transactionvolume)

2017-05-29 04:50:01,customer=ABC,trVol=2009,elapsedtimeAvg=175.7988
2017-05-29 04:50:01,customer=DEF,trVol=500,elapsedtimeAvg=50.2

Now, i need to identify those customers who is sending 50% or more transactions in last 1 hour compared to the moving average of transaction volume for last 3 weeks

Can you help me with the SPL?

sourcetype= customermetric customer!= Unknown
| streamstats window=6 global=f avg(trVol) as rollingavg by customer

Customer metric is gathered every 10 minutes.

0 Karma

DalJeanis
Legend

Okay, so you have to calculate two pieces of information -

(1) what is the customer's moving average transaction volume per unit of time for the last 3 weeks
(2) what is the customer's transaction volume for the most recent unit of time.

Your best bet for efficiency will probably be to build a summary index, but let's ignore that for a moment.

You apparently have the calculation for hourly average, so we drop in the 3-week average calculation (I think time_window works better than window for this matter) and then compare...

sourcetype= customermetric customer!= Unknown 
| streamstats time_window=3w avg(trVol) as rollingavg3w by customer
| streamstats window=6 global=f avg(trVol) as rollingavg1h by customer
| where rollingavg1h>1.5*rollingavg3w

DalJeanis
Legend

What I haven't commented on is the "50% more" part of your query. Unless your customers have VERY steady volumes, you would probably be better off calculating the 95th percentile (or something like that). Run a few runs and compare the p95 to 1.5*avg and see what the shape of your data really is.

 sourcetype= customermetric customer!= Unknown 
 | streamstats time_window=3w avg(trVol) as W3avg  p95(trVol) as W3p95  by customer
 | streamstats window=6 global=f avg(trVol) as rollingavg1h by customer
 | where rollingavg1h>W3p95
0 Karma

Ravan
Path Finder

trendline command along with timechart might work here.

Ex:

sourcetype= customermetric customer!= Unknown   earliest=-3w |bin span=1h
_time | stats sum(trVol) AS volume_hour by user,_time| trendline sma504(volume_hour) AS sma_avg|your logic to caluculate 50 % diff btw hour count vs avg |... search filters
0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...