Splunk Search

How to add a column to count for consecutive occurrences?

phamxuantung
Communicator

Hello,

I have a monthly report that produce a table like this

Violation list

Employee month
A 8-2022
B 8-2022

 

I want to add a counter to count for consecutive occurrences. For example:

In September report (9-2022)

- employee A violated, his counter increase to 2

- employee B don't violate, don;t show up in report

- employee C violated, first time show up on report, his counter is 1

In October report (10-2022)

- employee A violated again, his counter increase to 3

- employee B violated again, but don't show up on September report, his counter reset to 1.

- employee C violated again, his counter increase to 2

I want the end table to look like this

Employee month Counter
A 10-2022 3
B 10-2022 1
C 10-2022 2

Since the report was a outputlookup csv, I don't think I can use streamstat, Can anyone suggest a way to do this.

Labels (3)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

It feels such a long time ago I couldn't remember how iteration was done😛 OK, mvexpand saved the day.  Also, the previous attempt to survive year end was invalid.  That has to be corrected, too.

Again, this is a very literal interpretation of your requirements:

 

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, nmax - nmin + 1) ``` previous definition will not work across year ```
| mvexpand mrange
| eval rmonth = relative_time(min_month, "+".tostring(mrange)."mon") ``` this survives year boundary ```
| eval outmonth = if(isnull(mvfind(strftime(month, "%m"), strftime(rmonth, "%m"))), rmonth, null()) ``` when Employee has no record ```
| stats max(outmonth) as maxout by Employee month max_month ``` last month when Employee has no record ```
| eval maxout = if(isnull(maxout), 0, maxout)
| where month > maxout
| rename max_month as _time
| stats count by Employee _time

 

Employee_timecount
A2022-104
B2022-102
D2022-101

Here is the data emulation for the test.

 

| makeresults
| eval _raw="Employee,month
A,7-2022
D,7-2022
A,8-2022
C,8-2022
D,8-2022
A,9-2022
B,9-2022
C,9-2022
A,10-2022
B,10-2022
D,10-2022"
| multikv forceheader=1

 

As you can see, D had records in 7, 8, but was missing from 9.

Employeemonth
A7-2022
D7-2022
A8-2022
C8-2022
D8-2022
A9-2022
B9-2022
C9-2022
A10-2022
B10-2022
D10-2022

 

View solution in original post

Tags (1)

yuanliu
SplunkTrust
SplunkTrust

Assuming the search time is current month, here is a literal implementation of your requirements:

 

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(nmin, nmax + 1)
| eval nmonths = mvmap(mrange, if(mrange > nmonth, strptime(nyear - 1 . "-" . mrange . "-01", "%Y-%m-%d"), strptime(nyear . "-" . mrange . "-01", "%Y-%m-%d")))
```    ^ calculate calendar months in range ```
| eval outmonth = max(mvmap(nmonths, mvappend(outmonth, if(IN(nmonths, month), null(), nmonths)))) ``` last month when Employee had no record ```
| eval month = mvmap(month, mvappend(lastmonths, if(month < outmonth, null(), month))) ``` only count last months when Employee has record ```
| eval count = mvcount(month)
| rename max_month as _time
| table Employee _time count

 

Using modified sample data, the output is

Employee_timecount
A2022-093
B2022-091
C2022-092

(Because current month is 09, I moved all data back one month.)

Below is simulated data used in the test.  To make things more interesting, I added an employee D that only occurred in the first two months but not the current month's report. 

 

| makeresults
| eval _raw="Employee,month
A,7-2022
B,7-2022
D,7-2022
A,8-2022
C,8-2022
D,8-2022
A,9-2022
B,9-2022
C,9-2022"
| multikv forceheader=1

 

Employeemonth
A7-2022
B7-2022
D7-2022
A8-2022
C8-2022
D8-2022
A9-2022
B9-2022
C9-2022
0 Karma

yuanliu
SplunkTrust
SplunkTrust

For the record, the original code (for Splunk 8 and later) will not survive year end despite a convoluted attempt.  Here is the corrected code using relative_time with mvrange.

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, nmax - nmin + 1) ``` previous definition will not work across year ```
| eval nmonths = mvmap(mrange, relative_time(min_month, "+".tostring(mrange)."mon")) ``` also much less convoluted ```
```    ^ calculate calendar months in range ```
| eval outmonth = max(mvmap(nmonths, mvappend(outmonth, if(IN(nmonths, month), null(), nmonths)))) ``` last month when Employee had no record ```
| eval month = mvmap(month, mvappend(lastmonths, if(month < outmonth, null(), month))) ``` only count last months when Employee has record ```
| eval count = mvcount(month)
| rename max_month as _time
| table Employee _time count

 

Tags (1)
0 Karma

phamxuantung
Communicator

Sorry for the late reply,

My Splunk version is quite old (7.2.6) and it doesn't support mvmap. Do you have another function that could work?

0 Karma

yuanliu
SplunkTrust
SplunkTrust

It feels such a long time ago I couldn't remember how iteration was done😛 OK, mvexpand saved the day.  Also, the previous attempt to survive year end was invalid.  That has to be corrected, too.

Again, this is a very literal interpretation of your requirements:

 

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, nmax - nmin + 1) ``` previous definition will not work across year ```
| mvexpand mrange
| eval rmonth = relative_time(min_month, "+".tostring(mrange)."mon") ``` this survives year boundary ```
| eval outmonth = if(isnull(mvfind(strftime(month, "%m"), strftime(rmonth, "%m"))), rmonth, null()) ``` when Employee has no record ```
| stats max(outmonth) as maxout by Employee month max_month ``` last month when Employee has no record ```
| eval maxout = if(isnull(maxout), 0, maxout)
| where month > maxout
| rename max_month as _time
| stats count by Employee _time

 

Employee_timecount
A2022-104
B2022-102
D2022-101

Here is the data emulation for the test.

 

| makeresults
| eval _raw="Employee,month
A,7-2022
D,7-2022
A,8-2022
C,8-2022
D,8-2022
A,9-2022
B,9-2022
C,9-2022
A,10-2022
B,10-2022
D,10-2022"
| multikv forceheader=1

 

As you can see, D had records in 7, 8, but was missing from 9.

Employeemonth
A7-2022
D7-2022
A8-2022
C8-2022
D8-2022
A9-2022
B9-2022
C9-2022
A10-2022
B10-2022
D10-2022

 

Tags (1)

yuanliu
SplunkTrust
SplunkTrust

Just want to note that the second effort to survive year-end was still incorrect.☹️ I must use modulus in mrange:

| eval mrange = mvrange(0, (nmax - nmin + 13) % 12) ``` previous two definitions will not work across year ```

Put together, 

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, (nmax - nmin + 13) % 12) ``` previous 2 definitions will not work across year ```
| mvexpand mrange
| eval rmonth = relative_time(min_month, "+".tostring(mrange)."mon") ``` this survives year boundary ```
| eval outmonth = if(isnull(mvfind(strftime(month, "%m"), strftime(rmonth, "%m"))), rmonth, null()) ``` when Employee has no record ```
| stats max(outmonth) as maxout by Employee month max_month ``` last month when Employee has no record ```
| eval maxout = if(isnull(maxout), 0, maxout)
| where month > maxout
| rename max_month as _time
| stats count by Employee _time

 Or, if anyone prefers mvmap (Splunk 8 or later) over mvexpand,

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, (nmax - nmin + 13) % 12) ``` previous 2 definitions will not work across year ```
| eval nmonths = mvmap(mrange, relative_time(min_month, "+".tostring(mrange)."mon")) ``` also much less convoluted ```
```    ^ calculate calendar months in range ```
| eval outmonth = max(mvmap(nmonths, mvappend(outmonth, if(IN(nmonths, month), null(), nmonths)))) ``` last month when Employee had no record ```
| eval month = mvmap(month, mvappend(lastmonths, if(month < outmonth, null(), month))) ``` only count last months when Employee has record ```
| eval count = mvcount(month)
| rename max_month as _time
| table Employee _time count
0 Karma

phamxuantung
Communicator

Thank you, this is working as intended and fit what I need.

0 Karma

gcusello
SplunkTrust
SplunkTrust
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Assuming you are replacing the contents of the csv every month with outputlookup, you could do something like this

<your report without final outputlookup>
| lookup your.csv Employee
| eval Counter=if(isnull(Counter),1,Counter+1)
| outputlookup your.csv
0 Karma
Get Updates on the Splunk Community!

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...

Updated Team Landing Page in Splunk Observability

We’re making some changes to the team landing page in Splunk Observability, based on your feedback. The ...

New! Splunk Observability Search Enhancements for Splunk APM Services/Traces and ...

Regardless of where you are in Splunk Observability, you can search for relevant APM targets including service ...