Splunk Search

check if current usage breached last highest value by a given percentage over multiple sources

splunek
Engager

Hi.
I'm a splunk newbie and I am trying to construct a query over multiple sources that will do a sum of points over a day in 1 hr buckets and tell me whether in the past hour I have breached the max bucket size by a factor of 10%. I have tried to do this via a subsearch, which is very slow and I can't get it to quite work, or as one query, where I got the counts, but can't figure out how to get it to tell me whether it breached the max.

Here is what I have so far:

query returning pts for each event| bin _time span=1h as hour | stats sum(pts) as sum_pts by hour  sourcetype

Any help at all would be greatly appreciated!

Tags (1)
0 Karma

lguinn2
Legend

Try this

query returning pts for each event earliest=-24h@h latest=@h
| bin _time span=1h as hour 
| eval thisHour = if (hour <= relative_time(now(),"-1h@h"),0,1)
| stats sum(pts) as sum_pts by hour sourcetype thisHour
| stats max(sum_pts) as maxPts by sourcetype thisHour
| eval pts = if(thisHour==1,maxPts,0)
| eval maxPts = if(thisHour==0,maxPts,0)
| stats sum(maxPts) as maxPts sum(pts) as PtsThisHour by sourcetype
| where PtsThisHour >= (maxPts * 1.1) 

This could probably be shortened, but it will be fast. Here is a line-by-line explanation

1- search - I added the earliest and latestbecause the search should to start and end on hour boundaries to be accurate.

2 - eval - sets a field that will be 1 if the data is from the last hour, and 0 otherwise. This lets us avoid a subsearch or join

3 - stats - Add up the points for each sourcetype by hour AND whether or not the thisHour is set. We should still have 24 results for each sourcetype.

4 - stats - Figure out the max points for each sourcetype. Including thisHour means we will get two results for each sourcetype - one result which reflects just the sum for the past hour, and the other result which contains the maximum of the other 23 hours.

4 & 5 - eval - These two statements effectively move the "maxPts" for the the past hour into its own field named "pts"

6 - stats - collapses the two results for each sourcetype into a single result

7 - where - excludes the results for sourcetypes that don't meet the criteria

HTH!

lguinn2
Legend

Sorry about the typos - I updated my answer for future reference.

Thanks for sticking with it!

0 Karma

splunek
Engager

Here was the final version that worked for me, for reference:

query returning pts for each event earliest=-24h@h latest=@h
| bin _time span=1h as hour
| eval thisHour = if (hour <= relative_time(now(),"-1h@h"),1,0)
| stats sum(pts) as sum_pts by hour sourcetype thisHour
| stats max(sum_pts) as maxPts by sourcetype thisHour
| eval pts = if(thisHour==0,maxPts,0)
| eval maxPts = if(thisHour==1,maxPts,0)
| stats sum(maxPts) as maxPts sum(pts) as PtsThisHour by sourcetype
| where PtsThisHour >= (maxPts * 1.1)

0 Karma

splunek
Engager

Ok, there were just a couple of typos. One of the sourcetypes was spelled sourceType, and the maxPts and PtsThisHour were reversed, but other than that, works like a charm! Thank you so much!

0 Karma

splunek
Engager

Thanks, but I'm still having some trouble. I needed to put the "-1h@h" in quotes or it wouldn't evaluate, and also, it seems to break at step 4, at which point I get no results. I can't seem to fix it.

0 Karma

splunek
Engager

I've been trying to edit the original text of the question to make it clearer, but I can't get past the captcha...

0 Karma

splunek
Engager

That's not exactly what I'm looking for, as I'm actually looking for a specific log line that returns # of points processed per query, and that is the sum I am looking at, although I guess I can used index volume as a proxy for that, but that wouldn't be very exact.

0 Karma

lukejadamec
Super Champion

This is a much faster way to get index volume by sourcetype:
index=_internal source=*metrics.log group=per_sourcetype_thruput series!=_* | eval totalGB = (kb/1024)/1024 | bin _time span=1h as hour |convert ctime(hour) as Hour| stats sum(totalGB) as totalGB by Hour series |rename series AS Sourcetype
I'm not sure how to compare one hour to the next, but I know it can be done.

0 Karma
Get Updates on the Splunk Community!

Aligning Observability Costs with Business Value: Practical Strategies

 Join us for an engaging Tech Talk on Aligning Observability Costs with Business Value: Practical ...

Mastering Data Pipelines: Unlocking Value with Splunk

 In today's AI-driven world, organizations must balance the challenges of managing the explosion of data with ...

Splunk Up Your Game: Why It's Time to Embrace Python 3.9+ and OpenSSL 3.0

Did you know that for Splunk Enterprise 9.4, Python 3.9 is the default interpreter? This shift is not just a ...