
Delta time between each row - calculate frequency of values in buckets

maayan
Path Finder

Hi,

I have a table of time, machine, and total errors. For each machine, I need to count how many times 3 (or more) errors happened within 5 minutes. If 3 or more errors happened in one bucket, I flag that row as True.
Finally, I will return the frequency of 3 errors in 5 minutes (the count of all rows where Occurrence==True).
I succeeded in doing this in Python, but not in Splunk.

I wrote the following code:

| table TimeStamp, machine, totalErrors
| eval time = strptime(TimeStamp, "%Y-%m-%d %H:%M:%S.%3N")
| eval threshold=3
| eval time_window="5m"
| bucket span=5m time
| sort 0 machine, time
| streamstats sum(totalErrors) as cumulative_errors by machine, time
| eval Occurrence = if(cumulative_errors >= 3, "True", "False")
| table machine, TimeStamp, Occurrence



It is almost correct, but row 5 is supposed to be True. If we calculate the delta time between rows 1 and 5, more than 5 minutes have passed; but if we calculate the delta time between rows 2 and 5, less than 5 minutes have passed, and the number of errors is >= 3.
How can I change it so that it finds the delta time between each pair of rows (2 to 5, 3 to 5, ...) for each machine?

I hope this makes sense. I need short and simple code because I will also need to do this for 1m, 2m, ... windows and 3, 5, ... errors.

row Machine TimeStamp Occurrence
1 machine1 12/14/2023 10:12:32 FALSE
2 machine1 12/14/2023 10:12:50 FALSE
3 machine1 12/14/2023 10:13:06 TRUE
4 machine1 12/14/2023 10:13:24 TRUE
5 machine1 12/14/2023 10:17:34 FALSE
6 machine1 12/16/2023 21:01:45 FALSE
7 machine2 12/18/2023 7:53:54 FALSE
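For reference, here is a Python sketch of the sliding-window logic I'm after (hypothetical reconstruction: it assumes each row is a single error event, since the totalErrors values are not shown; with this logic row 5 comes out True, as desired):

```python
from datetime import datetime, timedelta

# Sample rows mirroring the table above: (machine, timestamp).
# Assumption: each row represents one error event.
rows = [
    ("machine1", "12/14/2023 10:12:32"),
    ("machine1", "12/14/2023 10:12:50"),
    ("machine1", "12/14/2023 10:13:06"),
    ("machine1", "12/14/2023 10:13:24"),
    ("machine1", "12/14/2023 10:17:34"),
    ("machine1", "12/16/2023 21:01:45"),
    ("machine2", "12/18/2023 7:53:54"),
]

def flag_occurrences(rows, threshold=3, window=timedelta(minutes=5)):
    """Mark a row True when it and the (threshold-1) preceding rows of the
    same machine all fall within the sliding time window."""
    parsed = [(m, datetime.strptime(t, "%m/%d/%Y %H:%M:%S")) for m, t in rows]
    flags = []
    for i, (machine, ts) in enumerate(parsed):
        # Timestamps of the most recent `threshold` rows for this machine,
        # up to and including the current row.
        same = [t for m, t in parsed[: i + 1] if m == machine][-threshold:]
        flags.append(len(same) == threshold and ts - same[0] <= window)
    return flags

print(flag_occurrences(rows))
# [False, False, True, True, True, False, False]  (row 5 is True here)
```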


thanks,

Maayan


dtburrows3
Builder

Not sure if this is exactly what you are looking for but I think it is pretty close.

dtburrows3_0-1703000010773.png


I got this output by stringing together a couple of streamstats commands with the window=<int> and reset_before=<criteria> parameters:

| sort 0 +Machine, +time    
    | streamstats
        count as row
    | eval
        TimeStamp=strftime(time, "%m/%d/%Y %H:%M:%S")
    | fields - _time
    | fields + row, Machine, TimeStamp, time
    | streamstats window=3
        count as running_count,
        min(time) as min_time,
        max(time) as max_time
            by Machine
    | eval
        seconds_diff='time'-'min_time',
        duration_diff=tostring(seconds_diff, "duration")
    | streamstats window=3 reset_before="("seconds_diff>300")"
        count as running_count
            by Machine
    | eval
        Occurrence=if(
            'seconds_diff'<=300 AND 'running_count'==3,
                "TRUE",
                "FALSE"
            )
    | fields + row, Machine, TimeStamp, Occurrence
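Roughly, the window=3 check above works like this Python sketch (an illustration, not the SPL itself; it assumes one event per row, as in the sample data further down — the deque's maxlen plays the role of window=3, and the span check covers what reset_before handles in SPL):

```python
from collections import defaultdict, deque

def occurrence_flags(events, window=3, max_span=300):
    """events: list of (machine, epoch_seconds), pre-sorted by machine, time.
    Mirrors `streamstats window=3 min(time) ... by Machine` followed by the
    seconds_diff<=300 test: True when the current row and the (window-1)
    rows before it for the same machine span at most max_span seconds."""
    trailing = defaultdict(lambda: deque(maxlen=window))
    flags = []
    for machine, t in events:
        q = trailing[machine]
        q.append(t)  # old timestamps fall out once the deque is full
        flags.append(len(q) == window and t - q[0] <= max_span)
    return flags

# Events at 0s, 18s, 34s, 52s, 302s, 1000s for one machine:
print(occurrence_flags([("m1", 0), ("m1", 18), ("m1", 34),
                        ("m1", 52), ("m1", 302), ("m1", 1000)]))
# [False, False, True, True, True, False]
```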

 
Here is the full SPL I used to generate the screenshot (results may vary because of the use of relative_time()):

| makeresults
    | eval
        Machine="machine 1",
        time=relative_time(now(), "-2h@s")
    | append
        [
            | makeresults
                | eval
                    Machine="machine 1",
                    time=relative_time(now(), "-2h+18s@s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 1",
                    time=relative_time(now(), "-2h+34s@s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 2",
                    time=relative_time(now(), "+4d@d+20h@h+31m@m+48s@s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 1",
                    time=relative_time(now(), "-2h+52s@s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 2",
                    time=relative_time(now(), "+4d+5h+5m+2s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 1",
                    time=relative_time(now(), "-2h+302s@s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 1",
                    time=relative_time(now(), "+2d-5h+18s@s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 2",
                    time=relative_time(now(), "+4d+5h+18s@s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 2",
                    time=relative_time(now(), "+4d+5h+2m+1s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 2",
                    time=relative_time(now(), "+4d+5h+2m+34s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 2",
                    time=relative_time(now(), "+4d+5h+4m-12s")
            ]
    | append
        [
            | makeresults
                | eval
                    Machine="machine 2",
                    time=relative_time(now(), "+4d@d+20h@h+43m@m+5s@s")
            ]
    | sort 0 +Machine, +time    
    | streamstats
        count as row
    | eval
        TimeStamp=strftime(time, "%m/%d/%Y %H:%M:%S")
    | fields - _time
    | fields + row, Machine, TimeStamp, time
    | streamstats window=3
        count as running_count,
        min(time) as min_time,
        max(time) as max_time
            by Machine
    | eval
        seconds_diff='time'-'min_time',
        duration_diff=tostring(seconds_diff, "duration")
    | streamstats window=3 reset_before="("seconds_diff>300")"
        count as running_count
            by Machine
    | eval
        Occurrence=if(
            'seconds_diff'<=300 AND 'running_count'==3,
                "TRUE",
                "FALSE"
            )
    | fields + row, Machine, TimeStamp, Occurrence

ITWhisperer
SplunkTrust

It is not clear why row 5 should be True, since you haven't shared the data (the number of errors in each event).

Having said that, are you trying to implement a sliding 5-minute window, or are you using time bins?

If you are using time bins, then row 5 is in a different bin from rows 1-4.
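To illustrate the difference (a Python sketch using timestamps from your table):

```python
from datetime import datetime

times = [datetime.strptime(t, "%m/%d/%Y %H:%M:%S") for t in (
    "12/14/2023 10:12:32",  # row 1
    "12/14/2023 10:13:06",  # row 3
    "12/14/2023 10:17:34",  # row 5
)]

# Fixed bins, like `| bucket span=5m time`: floor each timestamp to a
# 5-minute boundary. Rows 1 and 3 land in the 10:10 bin, row 5 in 10:15,
# so a per-bin count never groups them together.
bins = [t.replace(minute=t.minute - t.minute % 5, second=0) for t in times]
print([b.strftime("%H:%M") for b in bins])  # ['10:10', '10:10', '10:15']

# Sliding window: row 5 is within 5 minutes of row 3 even though they
# fall in different fixed bins.
print((times[2] - times[1]).total_seconds())  # 268.0 (under 300 seconds)
```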
