Splunk Search

How to count based upon relative_time?

sjringo
Contributor

I think what I am trying to do is relatively easy?
I want to query looking back 8 hours and then count the number of events that fall in a specific 4-hour window.

index=anIndex sourcetype=aSourceType aString earliest=-481m latest=-1m
| eval aTime2 = _time
| eval A = if (aTime2 > relative_time(now(),"-241m@m") AND aTime2 < relative_time(now(),"-1m@m"),(A+1),A)
| table A, aTime2



I would also want a count for the next sliding 4-hour window (-300m to -60m); there are a few more, but I am just trying to figure out the first one for now.

I was expecting my variable "A" to show how many of my matched events occur within the first 4-hour period, but it's empty?

Am I going about this incorrectly by not seeding "A" with a 0 start value?

What am I missing?

 



bowesmana
SplunkTrust

Basic search for doing this is

index...
| eval isInWindow = if (_time > relative_time(now(),"-241m@m") AND _time < relative_time(now(),"-1m@m"),1,0)
| stats sum(isInWindow) as A

which sets isInWindow to 1 or 0 depending on whether the event is in or out of the window, then just sums the field.

As for calculating sliding windows, streamstats is one way to do that, but you could also just do the maths to set various counters using the same relative_time logic and then sum those counters.

There are other ways, but it depends on what you want to do with the result.
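Outside SPL, the same flag-and-sum idea can be sketched in Python to make the logic concrete (the timestamps and window bounds here are made-up sample values, not from the original search):

```python
# Count events inside a relative time window by flagging each event 1/0
# and summing the flags - the same pattern as the SPL if() + stats sum().

def count_in_window(event_times, now, start_min, end_min):
    """Count events whose age lies between end_min and start_min minutes."""
    lo = now - start_min * 60  # older bound, e.g. -241m
    hi = now - end_min * 60    # newer bound, e.g. -1m
    flags = [1 if lo < t < hi else 0 for t in event_times]
    return sum(flags)

now = 1_000_000
# Three events: 120, 300 and 30 minutes old.
events = [now - 120 * 60, now - 300 * 60, now - 30 * 60]
print(count_in_window(events, now, 241, 1))    # -> 2 (the 120m and 30m events)
print(count_in_window(events, now, 481, 241))  # -> 1 (only the 300m event)
```

Counting each of the overlapping windows later in the thread amounts to running this same test once per window over the same set of events.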


sjringo
Contributor

I have made some progress and this is where I am at.

index=anIndex sourcetype=aSourceType aString earliest=-481m latest=-1m
| eval aWindow = case ( (_time > relative_time(now(),"-241m@m") AND (_time < relative_time(now(),"-1m@m"))),1,
(_time > relative_time(now(),"-246m@m") AND (_time < relative_time(now(),"-6m@m"))),2,
(_time > relative_time(now(),"-251m@m") AND (_time < relative_time(now(),"-11m@m"))),3,
(_time > relative_time(now(),"-271m@m") AND (_time < relative_time(now(),"-31m@m"))),4,
(_time > relative_time(now(),"-301m@m") AND (_time < relative_time(now(),"-61m@m"))),5,
(_time > relative_time(now(),"-331m@m") AND (_time < relative_time(now(),"-91m@m"))),6,
(_time > relative_time(now(),"-361m@m") AND (_time < relative_time(now(),"-121m@m"))),7,
(_time > relative_time(now(),"-481m@m") AND (_time < relative_time(now(),"-241m@m"))),8,
true(),9)
| stats count by aWindow



but I have realized that a case statement assigns each log event to only one window, whereas my windows overlap and one log event can exist in more than one window.
I am working on a dashboard with 8 widgets that currently run the exact same query, just with a different window.
So I am trying to make one query that has all the data for the calculation, then in the widget(s) previously mentioned use $query.1$ to retrieve the result from the base reusable query.

So, how do I handle counting in these overlapping windows?


bowesmana
SplunkTrust

Here's one way of doing it:

 

index=anIndex sourcetype=aSourceType aString earliest=-481m@m latest=-1m@m
| eval age=now() - _time
| eval age_ranges=split("1,6,11,31,61,91,121,241",",")
| foreach 0 1 2 3 4 5 6 7 [ eval r=tonumber(mvindex(age_ranges, <<FIELD>>))*60, zone=if(age < 14400 + r AND age > r, <<FIELD>>, null()), z=mvappend(z, zone) ]
| stats count by z

 

What this is effectively doing is setting up the base ages, which are your right-hand minute values. Then, as each gap is 4 hours (14400 seconds), a foreach loop goes round each of the 8 age bands and checks whether the age falls in that band.

The output is a multivalue field that contains the bands the event is found in.

Then stats count by z will count for each band, so that should give you the counts in each of your bands.
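If the foreach/mvappend pattern is unfamiliar, here is the same band test sketched in Python (the band starts and the 14400-second width are taken from the SPL above; the sample age is made up):

```python
# Assign an event's age (in seconds) to every 4-hour band it falls in,
# mirroring the SPL foreach loop with mvindex() and mvappend().
AGE_RANGES_MIN = [1, 6, 11, 31, 61, 91, 121, 241]  # right-hand edges, minutes
WINDOW = 14400  # each band is 4 hours wide, in seconds

def bands_for_age(age):
    """Return the index of every band whose range contains this age."""
    out = []
    for i, start_min in enumerate(AGE_RANGES_MIN):
        r = start_min * 60
        if r < age < r + WINDOW:
            out.append(i)
    return out

# A 120-minute-old event overlaps the first six sliding windows.
print(bands_for_age(120 * 60))  # -> [0, 1, 2, 3, 4, 5]
```

Because one event can land in several bands, the multivalue field (a list here) is what lets stats count by z count the event once per band rather than once overall.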

Does this give you what you want?

bowesmana
SplunkTrust

There are some small improvements you could make in case there are 0 results in any given bin - if a range is missing, the appendcols may fail to align the data correctly, so this will ensure there are the correct number of events before the transpose:

| makecontinuous aString1Count start=0 end=8
| fillnull count

Do that for each search.

Also, the initial age calculation is wrong in that it uses now() - _time, when it should actually use the search's latest time, so I show a fix for that below. In addition, you want to strip out the -1 to -2 minute section, which is NOT in any of the ranges (your first range is +1 to +241).

You can also make it faster by using a single search rather than appendcols, which is not efficient.

There are two ways, which simply depend on how you do the chart, but I include them as a learning exercise.

index=anIndex sourcetype=aSourceType (aString1 OR aString2) earliest=-481m@m latest=-1m@m 
``` Calculate the age of the event - this is latest time - event time ```
| addinfo
| eval age=info_max_time - _time 
``` Calculate range bands we want ```
| eval age_ranges=split("1,6,11,31,61,91,121,241",",") 
``` Not strictly necessary, but ensures clean data ```
| eval range=null()

``` This is where you set a condition "1" or "2" depending on whether
    the event is a result from aString1 or aString2 ```
| eval type=if(event_is_aString1, "A", "B")

``` Band calculation ```
| foreach 0 1 2 3 4 5 6 7 
    [ eval r=tonumber(mvindex(age_ranges, <<FIELD>>))*60, zone=if(age < 14400 + r AND age > r, <<FIELD>>, null()), range=mvappend(range, zone) ] 

``` So this removes the events in the pre-1 minute band ```
| where isnotnull(range)

``` Now this chart gives you 8 rows and 3 columns, first column is range, 
    2nd is counts for aString1 and 3rd for aString2```
| chart count over range by type 

``` This ensures you have values for each range ```
| makecontinuous range start=0 end=8 
| fillnull A B

``` And now create your fields on a single row ```
| eval range=range+1
| eval string1Window{range}=A, string2Window{range}=B
| stats values(string*) as string*

 

sjringo
Contributor

Thanks for all the little cleanup suggestions.  They were something I was going to get to after I got the first iteration working.
I am going to put them into my notes for later...


I have incorporated them, along with getting rid of the appendcols (I was aware of using a single search looking for both strings, then doing an if / case / match to determine the 'type' of event).

I was not aware of the 'fix' to the range index #, but I like the adding of 1 to get rid of the 0 indexing.

Then, what I wanted was the percentage A / B, so at the end:

| eval percentage{range} = round((A / B) * 100)
| stats values(percentage*) AS percentage*

 

All tested for the first scenario. I am working on one more enhancement where I have the same 8 'windows' but 6 different samplings: 5 are 4-hour samples using decreasing window sizes (4Hr, 2Hr, 1.5Hr, 1Hr, 30Min), and the last is still a 4-hour sample but starting with a 5Min window, then 10Min, 15Min, 30Min, 60Min, 90Min, 120Min, 240Min windows.

Thanks for the help and suggestions !


bowesmana
SplunkTrust

I am not sure I understood the additional dimension - too many numbers for my head (8 windows, 6 samples, 5 samples...), so I got lost.

However, if this helps: in the first search, the range bands were defined simply by the age plus a fixed 14400-second window. If you want to vary the window as well, you can use another array, i.e.

| eval age_ranges=split("1,6,11,31,61,91,121,241",",") 
| eval window=split("1800,3600,5400,7200,14400,14400,14400,14400",",") 
...
``` Band calculation ```
| foreach 0 1 2 3 4 5 6 7 
    [ eval r=tonumber(mvindex(age_ranges, <<FIELD>>))*60, 
           s=tonumber(mvindex(window, <<FIELD>>)),
           zone=if(age < s + r AND age > r, <<FIELD>>, null()), range=mvappend(range, zone) ] 

but again, I am not sure I understood the requirement.
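The two-array variant can be sketched the same way in Python (the band starts and window widths below are the example values from the SPL above; the sample age is made up):

```python
# Band test with a per-band window width instead of a fixed 14400 seconds,
# mirroring the SPL that reads the width from a second split() array.
AGE_RANGES_MIN = [1, 6, 11, 31, 61, 91, 121, 241]
WINDOWS_SEC = [1800, 3600, 5400, 7200, 14400, 14400, 14400, 14400]

def bands_for_age(age):
    """Return every band index whose (start, start + width) range holds age."""
    out = []
    for i, (start_min, width) in enumerate(zip(AGE_RANGES_MIN, WINDOWS_SEC)):
        r = start_min * 60
        if r < age < r + width:
            out.append(i)
    return out

# A 45-minute-old event is already too old for the first, 30-minute-wide band.
print(bands_for_age(45 * 60))  # -> [1, 2, 3]
```

The only change from the fixed-width version is that the width is looked up per band, exactly as the SPL looks up mvindex(window, <<FIELD>>).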


sjringo
Contributor

I have a follow-up question. What I am trying to figure out is how to introduce 'time' into the results so that I can create a graph showing each range's calculated percentage and the relative time for that range.
If each range has a min and max time, how would I go about showing the results as:

aRange(1), MaxTime (1), aPercentage(1)

aRange(2), MaxTime (2), aPercentage(2)

aRange(...), MaxTime (...), aPercentage(...)

aRange(8), MaxTime (8), aPercentage(8)

Here is the query I'm working with:

index=anIndex sourcetype=aSourcetype (aString1 OR aString2) earliest=-481m@m latest=-1m@m
| addinfo
| eval age=info_max_time - _time
| eval age_ranges=split("1,6,11,31,61,91,121,241",",")
| eval aRange=null()
| eval aType = case(match(_raw,"aString1"), 0, match(_raw,"aString2"), 1, true(),9)
| foreach 0 1 2 3 4 5 6 7 [ eval r=tonumber(mvindex(age_ranges, <<FIELD>>))*60, zone=if(age < 14400 + r AND age > r, <<FIELD>>, null()), aRange=mvappend(aRange, zone) ]
| where isnotnull(aRange)
| chart count over aRange by aType
| eval aRange = aRange+1
| rename 0 AS A, 1 AS B
| eval aPercentage = round((A / B) * 100)
| table aRange, A,B, aPercentage

I have tried putting an eventstats before the chart count and using the indexes 0 to 7, but have not been able to get it to produce the result I'm looking for.


bowesmana
SplunkTrust

You possibly want this

...
| chart count max(_time) as MaxTime over aRange by aType
| eval aRange = aRange+1
| rename "count: 0" AS A, "count: 1" AS B
| eval aPercentage = round((A / B) * 100)

and then you will also have two columns MaxTime: 0 and MaxTime: 1, and then you can do

| eval MaxTime=max('MaxTime: 0', 'MaxTime: 1')

 

sjringo
Contributor

It does. I understand at a high level what it's doing, but I will need to walk through the specifics, although it does get me where I needed to be.
Here is what I ended up with:

index=anIndex sourcetype=aSourceType aString1 earliest=-481m@m latest=-1m@m
| eval age=now() - _time
| eval age_ranges=split("1,6,11,31,61,91,121,241",",")
| foreach 0 1 2 3 4 5 6 7 [ eval r=tonumber(mvindex(age_ranges, <<FIELD>>))*60, zone=if(age < 14400 + r AND age > r, <<FIELD>>, null()), aString1Count=mvappend(aString1Count, zone) ]
| stats count by aString1Count
| transpose 8 header_field=aString1Count
| rename 0 AS "string1Window1", 1 AS "string1Window2", 2 AS "string1Window3", 3 AS "string1Window4", 4 AS "string1Window5", 5 AS "string1Window6", 6 AS "string1Window7", 7 AS "string1Window8"
| appendcols
[search index=anIndex sourcetype=aSourceType aString2 earliest=-481m@m latest=-1m@m
| eval age=now() - _time
| eval age_ranges=split("1,6,11,31,61,91,121,241",",")
| foreach 0 1 2 3 4 5 6 7 [ eval r=tonumber(mvindex(age_ranges, <<FIELD>>))*60, zone=if(age < 14400 + r AND age > r, <<FIELD>>, null()), aString2Count=mvappend(aString2Count, zone) ]
| stats count by aString2Count
| transpose 8 header_field=aString2Count
| rename 0 AS "string2Window1", 1 AS "string2Window2", 2 AS "string2Window3", 3 AS "string2Window4", 4 AS "string2Window5", 5 AS "string2Window6", 6 AS "string2Window7", 7 AS "string2Window8" ]
| table string1Window* string2Window*

string1Window1 string1Window2 string1Window3 ...
44 42 40 ...