Hello,
I have a monthly report that produce a table like this
Violation list
Employee | month |
A | 8-2022 |
B | 8-2022 |
I want to add a counter to count for consecutive occurrences. For example:
In September report (9-2022)
- employee A violated, his counter increase to 2
- employee B don't violate, don;t show up in report
- employee C violated, first time show up on report, his counter is 1
In October report (10-2022)
- employee A violated again, his counter increase to 3
- employee B violated again, but don't show up on September report, his counter reset to 1.
- employee C violated again, his counter increase to 2
I want the end table to look like this
Employee | month | Counter |
A | 10-2022 | 3 |
B | 10-2022 | 1 |
C | 10-2022 | 2 |
Since the report was a outputlookup csv, I don't think I can use streamstat, Can anyone suggest a way to do this.
It feels such a long time ago I couldn't remember how iteration was done😛 OK, mvexpand saved the day. Also, the previous attempt to survive year end was invalid. That has to be corrected, too.
Again, this is a very literal interpretation of your requirements:
| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, nmax - nmin + 1) ``` previous definition will not work across year ```
| mvexpand mrange
| eval rmonth = relative_time(min_month, "+".tostring(mrange)."mon") ``` this survives year boundary ```
| eval outmonth = if(isnull(mvfind(strftime(month, "%m"), strftime(rmonth, "%m"))), rmonth, null()) ``` when Employee has no record ```
| stats max(outmonth) as maxout by Employee month max_month ``` last month when Employee has no record ```
| eval maxout = if(isnull(maxout), 0, maxout)
| where month > maxout
| rename max_month as _time
| stats count by Employee _time
Employee | _time | count |
A | 2022-10 | 4 |
B | 2022-10 | 2 |
D | 2022-10 | 1 |
Here is the data emulation for the test.
| makeresults
| eval _raw="Employee,month
A,7-2022
D,7-2022
A,8-2022
C,8-2022
D,8-2022
A,9-2022
B,9-2022
C,9-2022
A,10-2022
B,10-2022
D,10-2022"
| multikv forceheader=1
As you can see, D had records in 7, 8, but was missing from 9.
Employee | month |
A | 7-2022 |
D | 7-2022 |
A | 8-2022 |
C | 8-2022 |
D | 8-2022 |
A | 9-2022 |
B | 9-2022 |
C | 9-2022 |
A | 10-2022 |
B | 10-2022 |
D | 10-2022 |
Assuming the search time is current month, here is a literal implementation of your requirements:
| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(nmin, nmax + 1)
| eval nmonths = mvmap(mrange, if(mrange > nmonth, strptime(nyear - 1 . "-" . mrange . "-01", "%Y-%m-%d"), strptime(nyear . "-" . mrange . "-01", "%Y-%m-%d")))
``` ^ calculate calendar months in range ```
| eval outmonth = max(mvmap(nmonths, mvappend(outmonth, if(IN(nmonths, month), null(), nmonths)))) ``` last month when Employee had no record ```
| eval month = mvmap(month, mvappend(lastmonths, if(month < outmonth, null(), month))) ``` only count last months when Employee has record ```
| eval count = mvcount(month)
| rename max_month as _time
| table Employee _time count
Using modified sample data, the output is
Employee | _time | count |
A | 2022-09 | 3 |
B | 2022-09 | 1 |
C | 2022-09 | 2 |
(Because current month is 09, I moved all data back one month.)
Below is simulated data used in the test. To make things more interesting, I added an employee D that only occurred in the first two months but not the current month's report.
| makeresults
| eval _raw="Employee,month
A,7-2022
B,7-2022
D,7-2022
A,8-2022
C,8-2022
D,8-2022
A,9-2022
B,9-2022
C,9-2022"
| multikv forceheader=1
Employee | month |
A | 7-2022 |
B | 7-2022 |
D | 7-2022 |
A | 8-2022 |
C | 8-2022 |
D | 8-2022 |
A | 9-2022 |
B | 9-2022 |
C | 9-2022 |
For the record, the original code (for Splunk 8 and later) will not survive year end despite a convoluted attempt. Here is the corrected code using relative_time with mvrange.
| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, nmax - nmin + 1) ``` previous definition will not work across year ```
| eval nmonths = mvmap(mrange, relative_time(min_month, "+".tostring(mrange)."mon")) ``` also much less convoluted ```
``` ^ calculate calendar months in range ```
| eval outmonth = max(mvmap(nmonths, mvappend(outmonth, if(IN(nmonths, month), null(), nmonths)))) ``` last month when Employee had no record ```
| eval month = mvmap(month, mvappend(lastmonths, if(month < outmonth, null(), month))) ``` only count last months when Employee has record ```
| eval count = mvcount(month)
| rename max_month as _time
| table Employee _time count
Sorry for the late reply,
My Splunk version is quite old (7.2.6) and it doesn't support mvmap. Do you have another function that could work?
It feels such a long time ago I couldn't remember how iteration was done😛 OK, mvexpand saved the day. Also, the previous attempt to survive year end was invalid. That has to be corrected, too.
Again, this is a very literal interpretation of your requirements:
| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmonth = strftime(now(), "%m"), nyear = strftime(now(), "%Y")
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, nmax - nmin + 1) ``` previous definition will not work across year ```
| mvexpand mrange
| eval rmonth = relative_time(min_month, "+".tostring(mrange)."mon") ``` this survives year boundary ```
| eval outmonth = if(isnull(mvfind(strftime(month, "%m"), strftime(rmonth, "%m"))), rmonth, null()) ``` when Employee has no record ```
| stats max(outmonth) as maxout by Employee month max_month ``` last month when Employee has no record ```
| eval maxout = if(isnull(maxout), 0, maxout)
| where month > maxout
| rename max_month as _time
| stats count by Employee _time
Employee | _time | count |
A | 2022-10 | 4 |
B | 2022-10 | 2 |
D | 2022-10 | 1 |
Here is the data emulation for the test.
| makeresults
| eval _raw="Employee,month
A,7-2022
D,7-2022
A,8-2022
C,8-2022
D,8-2022
A,9-2022
B,9-2022
C,9-2022
A,10-2022
B,10-2022
D,10-2022"
| multikv forceheader=1
As you can see, D had records in 7, 8, but was missing from 9.
Employee | month |
A | 7-2022 |
D | 7-2022 |
A | 8-2022 |
C | 8-2022 |
D | 8-2022 |
A | 9-2022 |
B | 9-2022 |
C | 9-2022 |
A | 10-2022 |
B | 10-2022 |
D | 10-2022 |
Just want to note that the second effort to survive year-end was still incorrect.☹️ I must use modulus in mrange:
| eval mrange = mvrange(0, (nmax - nmin + 13) % 12) ``` previous two definitions will not work across year ```
Put together,
| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, (nmax - nmin + 13) % 12) ``` previous 2 definitions will not work across year ```
| mvexpand mrange
| eval rmonth = relative_time(min_month, "+".tostring(mrange)."mon") ``` this survives year boundary ```
| eval outmonth = if(isnull(mvfind(strftime(month, "%m"), strftime(rmonth, "%m"))), rmonth, null()) ``` when Employee has no record ```
| stats max(outmonth) as maxout by Employee month max_month ``` last month when Employee has no record ```
| eval maxout = if(isnull(maxout), 0, maxout)
| where month > maxout
| rename max_month as _time
| stats count by Employee _time
Or, if anyone prefers mvmap (Splunk 8 or later) over mvexpand,
| eval _time = strptime("01-".month, "%d-%m-%Y") ``` Use first day as marker ```
| stats min(_time) as min_month max(_time) as max_month values(_time) as month by Employee
| where max_month == relative_time(now(), "-0mon@mon") ``` only if Employee in current month ```
| eval nmin = strftime(min_month, "%m"), nmax = strftime(max_month, "%m")
| eval mrange = mvrange(0, (nmax - nmin + 13) % 12) ``` previous 2 definitions will not work across year ```
| eval nmonths = mvmap(mrange, relative_time(min_month, "+".tostring(mrange)."mon")) ``` also much less convoluted ```
``` ^ calculate calendar months in range ```
| eval outmonth = max(mvmap(nmonths, mvappend(outmonth, if(IN(nmonths, month), null(), nmonths)))) ``` last month when Employee had no record ```
| eval month = mvmap(month, mvappend(lastmonths, if(month < outmonth, null(), month))) ``` only count last months when Employee has record ```
| eval count = mvcount(month)
| rename max_month as _time
| table Employee _time count
Thank you, this is working as intended and fit what I need.
Hi @phamxuantung,
did you explored the "delta" command(https://www.splunk.com/en_us/blog/tips-and-tricks/search-commands-delta.html#:~:text=Like%20accum%2C.... or https://docs.splunk.com/Documentation/Splunk/9.0.1/SearchReference/Delta)? probably solves your need.
Could you share your search?
Ciao.
Giuseppe
Assuming you are replacing the contents of the csv every month with outputlookup, you could do something like this
<your report without final outputlookup>
| lookup your.csv Employee
| eval Counter=if(isnull(Counter),1,Counter+1)
| outputlookup your.csv