Splunk Search

How to add a column to count for consecutive occurrences?

phamxuantung
Communicator

Hello,

I have a monthly report that produce a table like this

Violation list

Employee month
A 8-2022
B 8-2022

 

I want to add a counter to count for consecutive occurrences. For example:

In September report (9-2022)

- employee A violated, his counter increase to 2

- employee B don't violate, don;t show up in report

- employee C violated, first time show up on report, his counter is 1

In October report (10-2022)

- employee A violated again, his counter increase to 3

- employee B violated again, but don't show up on September report, his counter reset to 1.

- employee C violated again, his counter increase to 2

I want the end table to look like this

Employee month Counter
A 10-2022 3
B 10-2022 1
C 10-2022 2

Since the report was a outputlookup csv, I don't think I can use streamstat, Can anyone suggest a way to do this.

Labels (3)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

It feels such a long time ago I couldn't remember how iteration was done😛 OK, mvexpand saved the day.  Also, the previous attempt to survive year end was invalid.  That has to be corrected, too.

Again, this is a very literal interpretation of your requirements:

 

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, nmax - nmin + 1) ``` previous definition will not work across year ```
| mvexpand mrange
| eval rmonth = relative_time(min_month, "+".tostring(mrange)."mon") ``` this survives year boundary ```
| eval outmonth = if(isnull(mvfind(strftime(month, "%m"), strftime(rmonth, "%m"))), rmonth, null()) ``` when Employee has no record ```
| stats max(outmonth) as maxout by Employee month max_month ``` last month when Employee has no record ```
| eval maxout = if(isnull(maxout), 0, maxout)
| where month > maxout
| rename max_month as _time
| stats count by Employee _time

 

Employee_timecount
A2022-104
B2022-102
D2022-101

Here is the data emulation for the test.

 

| makeresults
| eval _raw="Employee,month
A,7-2022
D,7-2022
A,8-2022
C,8-2022
D,8-2022
A,9-2022
B,9-2022
C,9-2022
A,10-2022
B,10-2022
D,10-2022"
| multikv forceheader=1

 

As you can see, D had records in 7, 8, but was missing from 9.

Employeemonth
A7-2022
D7-2022
A8-2022
C8-2022
D8-2022
A9-2022
B9-2022
C9-2022
A10-2022
B10-2022
D10-2022

 

View solution in original post

Tags (1)

yuanliu
SplunkTrust
SplunkTrust

Assuming the search time is current month, here is a literal implementation of your requirements:

 

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(nmin, nmax + 1)
| eval nmonths = mvmap(mrange, if(mrange > nmonth, strptime(nyear - 1 . "-" . mrange . "-01", "%Y-%m-%d"), strptime(nyear . "-" . mrange . "-01", "%Y-%m-%d")))
```    ^ calculate calendar months in range ```
| eval outmonth = max(mvmap(nmonths, mvappend(outmonth, if(IN(nmonths, month), null(), nmonths)))) ``` last month when Employee had no record ```
| eval month = mvmap(month, mvappend(lastmonths, if(month < outmonth, null(), month))) ``` only count last months when Employee has record ```
| eval count = mvcount(month)
| rename max_month as _time
| table Employee _time count

 

Using modified sample data, the output is

Employee_timecount
A2022-093
B2022-091
C2022-092

(Because current month is 09, I moved all data back one month.)

Below is simulated data used in the test.  To make things more interesting, I added an employee D that only occurred in the first two months but not the current month's report. 

 

| makeresults
| eval _raw="Employee,month
A,7-2022
B,7-2022
D,7-2022
A,8-2022
C,8-2022
D,8-2022
A,9-2022
B,9-2022
C,9-2022"
| multikv forceheader=1

 

Employeemonth
A7-2022
B7-2022
D7-2022
A8-2022
C8-2022
D8-2022
A9-2022
B9-2022
C9-2022
0 Karma

yuanliu
SplunkTrust
SplunkTrust

For the record, the original code (for Splunk 8 and later) will not survive year end despite a convoluted attempt.  Here is the corrected code using relative_time with mvrange.

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, nmax - nmin + 1) ``` previous definition will not work across year ```
| eval nmonths = mvmap(mrange, relative_time(min_month, "+".tostring(mrange)."mon")) ``` also much less convoluted ```
```    ^ calculate calendar months in range ```
| eval outmonth = max(mvmap(nmonths, mvappend(outmonth, if(IN(nmonths, month), null(), nmonths)))) ``` last month when Employee had no record ```
| eval month = mvmap(month, mvappend(lastmonths, if(month < outmonth, null(), month))) ``` only count last months when Employee has record ```
| eval count = mvcount(month)
| rename max_month as _time
| table Employee _time count

 

Tags (1)
0 Karma

phamxuantung
Communicator

Sorry for the late reply,

My Splunk version is quite old (7.2.6) and it doesn't support mvmap. Do you have another function that could work?

0 Karma

yuanliu
SplunkTrust
SplunkTrust

It feels such a long time ago I couldn't remember how iteration was done😛 OK, mvexpand saved the day.  Also, the previous attempt to survive year end was invalid.  That has to be corrected, too.

Again, this is a very literal interpretation of your requirements:

 

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, nmax - nmin + 1) ``` previous definition will not work across year ```
| mvexpand mrange
| eval rmonth = relative_time(min_month, "+".tostring(mrange)."mon") ``` this survives year boundary ```
| eval outmonth = if(isnull(mvfind(strftime(month, "%m"), strftime(rmonth, "%m"))), rmonth, null()) ``` when Employee has no record ```
| stats max(outmonth) as maxout by Employee month max_month ``` last month when Employee has no record ```
| eval maxout = if(isnull(maxout), 0, maxout)
| where month > maxout
| rename max_month as _time
| stats count by Employee _time

 

Employee_timecount
A2022-104
B2022-102
D2022-101

Here is the data emulation for the test.

 

| makeresults
| eval _raw="Employee,month
A,7-2022
D,7-2022
A,8-2022
C,8-2022
D,8-2022
A,9-2022
B,9-2022
C,9-2022
A,10-2022
B,10-2022
D,10-2022"
| multikv forceheader=1

 

As you can see, D had records in 7, 8, but was missing from 9.

Employeemonth
A7-2022
D7-2022
A8-2022
C8-2022
D8-2022
A9-2022
B9-2022
C9-2022
A10-2022
B10-2022
D10-2022

 

Tags (1)

yuanliu
SplunkTrust
SplunkTrust

Just want to note that the second effort to survive year-end was still incorrect.☹️ I must use modulus in mrange:

| eval mrange = mvrange(0, (nmax - nmin + 13) % 12) ``` previous two definitions will not work across year ```

Put together, 

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, (nmax - nmin + 13) % 12) ``` previous 2 definitions will not work across year ```
| mvexpand mrange
| eval rmonth = relative_time(min_month, "+".tostring(mrange)."mon") ``` this survives year boundary ```
| eval outmonth = if(isnull(mvfind(strftime(month, "%m"), strftime(rmonth, "%m"))), rmonth, null()) ``` when Employee has no record ```
| stats max(outmonth) as maxout by Employee month max_month ``` last month when Employee has no record ```
| eval maxout = if(isnull(maxout), 0, maxout)
| where month > maxout
| rename max_month as _time
| stats count by Employee _time

 Or, if anyone prefers mvmap (Splunk 8 or later) over mvexpand,

| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, (nmax - nmin + 13) % 12) ``` previous 2 definitions will not work across year ```
| eval nmonths = mvmap(mrange, relative_time(min_month, "+".tostring(mrange)."mon")) ``` also much less convoluted ```
```    ^ calculate calendar months in range ```
| eval outmonth = max(mvmap(nmonths, mvappend(outmonth, if(IN(nmonths, month), null(), nmonths)))) ``` last month when Employee had no record ```
| eval month = mvmap(month, mvappend(lastmonths, if(month < outmonth, null(), month))) ``` only count last months when Employee has record ```
| eval count = mvcount(month)
| rename max_month as _time
| table Employee _time count
0 Karma

phamxuantung
Communicator

Thank you, this is working as intended and fit what I need.

0 Karma

gcusello
SplunkTrust
SplunkTrust
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Assuming you are replacing the contents of the csv every month with outputlookup, you could do something like this

<your report without final outputlookup>
| lookup your.csv Employee
| eval Counter=if(isnull(Counter),1,Counter+1)
| outputlookup your.csv
0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...