Splunk Search

Timechart with Timewrap and upper and lower bounds

gabenav11
Explorer

Hello,

I'm looking to create a timechart query that timewraps every week for x number of weeks, showing the count of a given field at a variable span, within a given window of time on a certain day of the week, with upper and lower bounds as overlay lines that are averaged for that hour across all the weeks in the timechart.

I know that sounds super generic, so for example, the chart could show me the:
- average http_response_time every 5 minutes
- from 2-3pm on every Tuesday
- timewrapped weekly over the last 10 weeks
- with upper and lower bounds, say 25th and 75th percentile overlays, where the overlays are the average response time for that hour averaged out across all weeks, to check trends over the entire period
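To make it concrete, the wrapped part of what I'm after would look roughly like this (a sketch only; the earliest modifier and field name assume the eventgen data, and it doesn't yet produce the percentile overlays):

```
index=main /booking earliest=-10w@w
| where strftime(_time,"%A")="Tuesday" AND tonumber(strftime(_time,"%H"))=14
| timechart span=5m avg(http_response_time)
| timewrap 1w
```

The piece I can't work out is adding the perc25/perc75 overlays across the wrapped week columns at each 5-minute slot.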

Here is what I have come up with so far, using the eventgen app to generate test data:

index=main /booking 
| eval hour=tonumber(strftime(_time,"%H")) 
| eval year=tonumber(strftime(_time,"%Y")) 
| eval month=tonumber(strftime(_time,"%m")) 
| eval dayOfWeek=strftime(_time, "%A") 
| where dayOfWeek="Tuesday" AND hour=14
| timechart avg(http_response_time) span=1hour
| timewrap 1day series=relative
| append 
    [ search index=main /booking 
    | fields http_response_time 
    | timechart avg(http_response_time) as SecondAverage 
    stdev(http_response_time) as StandardDeviation
    perc75(http_response_time) as upper_control_limit
    perc25(http_response_time) as lower_control_limit span=1d
    | eval lowerBound=(SecondAverage-StandardDeviation), upperBound=(SecondAverage+StandardDeviation) 
    | fields _time lowerBound upperBound upper_control_limit lower_control_limit SecondAverage]
| eval hour=strftime(_time,"%H") 
| table hour *

Part of the issue with the above is that the overlay displays next to the averages instead of on top of them, like this:
https://imgur.com/a/YxregbJ
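One direction I've been considering instead of append (untested sketch; it computes the bounds in the same pass, keyed on the hour, so they should land on the same rows as the averages):

```
index=main /booking
| eval hour=strftime(_time,"%H")
| eventstats perc75(http_response_time) as upper_control_limit
    perc25(http_response_time) as lower_control_limit by hour
| timechart span=1hour avg(http_response_time) as actual
    first(upper_control_limit) as upperBound
    first(lower_control_limit) as lowerBound
```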

Any help is appreciated.

Thanks!

0 Karma
1 Solution

skoelpin
SplunkTrust

I've built this exact solution and it's been in production for over a year. There are a ton of complexities and gotchas here. Is this for a single entity or for multiple entities? If the latter, I'd strongly suggest you push this data to a summary index for faster queries and better usability. As for your bounds, you will need to add some evals which calculate the offset from the baseline. I will share my SPL tomorrow.

gabenav11
Explorer

@skoelpin thanks! were you able to find your spl to share?

0 Karma

skoelpin
SplunkTrust

Here ya go. Your base search is going to need a transformational command to pipe into this; I used stats, as you can see. This splits by an entity field (customer here, representing customers) and looks at a sliver of time in the past. Say it's noon on Tuesday: it will grab the last 5 noons on Tuesday over the last 5 weeks and average them together to create the baseline. You then create upper and lower bounds by selecting a confidence interval (I used 90% below). I have this as a daily search which populates a summary index, and another scheduled search which runs every 10 minutes that overlays this with the actual counts, for short feedback loops to measure accuracy. This will also alert when the actual count goes below the lowest threshold. Much faster time to value in ITSI.

index=....
| stats count by customer, _time 
| eval w=case( (_time>relative_time(now(), "+1d@d-5w-30m") AND _time<=relative_time(now(), "+1d@d-5w+1d+30m")), 5, (_time>relative_time(now(), "+1d@d-4w-30m") AND _time<=relative_time(now(), "+1d@d-4w+1d+30m")), 4, (_time>relative_time(now(), "+1d@d-3w-30m") AND _time<=relative_time(now(), "+1d@d-3w+1d+30m")), 3, (_time>relative_time(now(), "+1d@d-2w-30m") AND _time<=relative_time(now(), "+1d@d-2w+1d+30m")), 2, (_time>relative_time(now(), "+1d@d-1w-30m") AND _time<=relative_time(now(), "+1d@d-1w+1d+30m")), 1) 
| eval shift=case(isnotnull(w),"+"+w+"w-30m,+"+w+"w-20m,+"+w+"w-10m,+"+w+"w-0m,+"+w+"w+10m,+"+w+"w+20m,+"+w+"w+30m,") 
| where isnotnull(shift) 
| makemv delim="," shift 
| mvexpand shift 
| eval time=relative_time(_time,shift) 
| eventstats avg(count) AS pred by time, customer 
| eval upper=if(count>pred,count,pred) 
| eval lower=if(count<pred,count,pred) 
| eval lower=if(lower==0,"",lower) 
| eventstats avg(count) AS pred, stdev(count) as pred_stdev, by time, customer 
| eval upper=if(upper>pred+1*pred_stdev,pred_stdev*0.5+pred,upper) 
| eval lower=if(lower <pred-1*pred_stdev,pred_stdev*0.5+pred, lower) 
| stats avg(count) AS pred, stdev(upper) AS ustdev, stdev(lower) AS lstdev stdev(count) as stdev by time, customer 
| eval low=pred-lstdev*(sqrt(1/(1-90/100))) 
| eval low=if(low<0, 1, low) 
| eval high=pred+ustdev*(sqrt(1/(1-90/100))) 
| eval _time=time 
| timechart span=10m useother=false limit=0 cont=false min(pred) as pred , min(low) as low, min(high) as high by customer 
| makecontinuous _time 
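If it helps to sanity-check the bound arithmetic in the last few evals outside Splunk, here's the same scaling in Python (illustrative only; the sample counts are made up):

```python
import statistics

def bounds(counts, ci=90):
    # pred: the baseline, i.e. the average of the historical counts
    # (mirrors | stats avg(count) AS pred in the SPL)
    pred = statistics.mean(counts)
    # Splunk's stdev() is the sample standard deviation
    sd = statistics.stdev(counts)
    # scaling factor from the SPL: sqrt(1/(1 - 90/100)) = sqrt(10)
    k = (1 / (1 - ci / 100)) ** 0.5
    low = pred - sd * k
    high = pred + sd * k
    # mirrors | eval low=if(low<0, 1, low): clamp a negative lower bound to 1
    if low < 0:
        low = 1
    return low, high

# five historical noon-on-Tuesday counts
low, high = bounds([100, 110, 95, 105, 90])
```

With ci=90 the multiplier is sqrt(10) ≈ 3.16, which reads like a Chebyshev-style widening rather than a normal-distribution z-score.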

0 Karma

skoelpin
SplunkTrust

@gabenav11 did this answer your question? If so, can you please accept it? It took a decent amount of time to grab this SPL, sanitize it, and post it.

0 Karma

gabenav11
Explorer

Yes, thank you very much! I clicked accept on the answer you gave, but I cannot click accept on a response. Do you see that it says accept?

0 Karma

skoelpin
SplunkTrust

Hey @ppablo_splunk, can you help out @gabenav11? He's unable to accept the answer.

0 Karma

gabenav11
Explorer

Oh, there we go, I accepted it now. I thought that "[checkmark] accepted" meant that it was accepted, but actually that meant I had not yet accepted, and "[checkmark] unaccept" means it's accepted, lol.
