Solved: Re: Count events per month until a certain day

lufermalgo · ‎09-09-2017

Hi community,

I need your help!!!

It is possible to make a report that counts the number of events grouped by month but until a certain day, that is, if the current day is 9 then the events are counted until the 9th day of each month.

Example:

_time - count
09/09/2017 - 4567
08/09/2017 - 2346
07/09/2017 - 5678
06/09/2017 - 4789
05/09/2017 - 8965
04/09/2017 - 4567
03/09/2017 - 6789
02/09/2017 - 3456
01/09/2017 - 9087

Thanks

woodcock · ‎09-09-2017

Funky. Try this:

| makeresults
| eval date_mday=strftime(now(), "%d")
| map [|tstats count where index=* BY _time span=1d
| eval date_mday=strftime(_time, "%d")
| where date_mday <= tonumber($date_mday$)
| timechart span=1mon sum(count) AS count
| fillnull count value=0
| eval _time = _time + 24 * ($date_mday$ - 1) * 60 * 60]

View solution in original post

DalJeanis · ‎09-09-2017

Try this...

your query that gets the events you care about
| rename COMMENT as "count up the events at the day level"
| bin _time span=1d
| stats count as daycount by _time

| rename COMMENT as "Figure out the offset for which day this is"
| eventstats max(_time) as maxtime
| eval maxmonth=relative_time(maxtime,"@mon")
| eval nowoffset=maxtime-maxmonth

| rename COMMENT as "If date is 28 or greater, use flat month"
| eval revisedoffset=if(nowoffset>=28*86400,0,nowoffset)
| rename COMMENT as "subtract the offset and calculate the month for reporting purposes"
| eval reportingmonth=relative_time(_time-revisedoffset,"@mon")

| rename COMMENT as "sum up all the days, use the date of the last day as the date of the report"
| stats sum(daycount) as count max(_time) as reptdate by reportingmonth

| rename COMMENT as "present the report"
| eval _time = reptdate
| table _time count

This version will keep, for example, 4/30 with 5/29

 your query that gets the events you care about
 | rename COMMENT as "count up the events at the day level"
 | bin _time span=1d
 | stats count as daycount by _time

 | rename COMMENT as "Figure out the offset for which day this is, copy it to all records and delete the calc record"
 | appendpipe  
    [| stats max(_time) as maxtime 
     | eval maxmonth=relative_time(maxtime,"@mon") 
     | eval nowoffset=maxtime-maxmonth 
     ]
 | eventstats max(nowoffset) as nowoffset
 | where isnull(maxmonth)

 | rename COMMENT as "For this event, figure out the endpoints "
 | eval calcmonth0=relative_time(_time,"@mon")
 | eval calcmonth1=relative_time(_time,"+1mon@mon")
 | eval calcmonth2=relative_time(_time,"+2mon@mon")
 | eval endmonth0=if(calcmonth0+nowoffset>=calcmonth1,relative_time(calcmonth1,"-1d"),calcmonth0+nowoffset)
 | eval endmonth1=if(calcmonth1+nowoffset>=calcmonth2,relative_time(calcmonth2,"-1d"),calcmonth1+nowoffset)
 | eval reportingmonth=if(_time>endmonth0,endmonth1,endmonth0)


 | rename COMMENT as "sum up all the days, use the date of the last day as the date of the report"
 | stats sum(daycount) as count max(_time) as reptdate min(_time) as reptstartdate by reportingmonth

 | rename COMMENT as "present the report"
 | eval _time = reptdate
 | eval reptrange=strftime(reptstartdate,"%Y-%m-%d")." to ".strftime(reptdate,"%Y-%m-%d") 
 | table _time reptrange count

lufermalgo · ‎09-09-2017

hi @DalJeanis

Thanks a lot for your help, your solution seems to work, however, I have a problem with my data, they are duplicated and I need to clean them up to more than 10 million, so the search takes a lot.

DalJeanis · ‎09-09-2017

@lufermalgo - You are welcome. Under those circumstances, if this is not going to be just a one-time report, I would suggest that you consider creating a summary index to hold your daily figures.

This report will run quickly against the summary data, just like a tstats would.

If the duplicates are always created the same day, then your daily summary creation program can weed them out. If they are occasionally created later, then you can (for instance) resummarize your daily records a week later, and always pick the latest summary record for any given date. Again, this current report would run quickly against the summary index.

woodcock · ‎09-09-2017

needs more tstats...

DalJeanis · ‎09-09-2017

@woodcock - Probably true. Everything down to the first stats command can be replaced by the equivalent tstats command if the data is there.

woodcock · ‎09-09-2017

Funky. Try this:

| makeresults
| eval date_mday=strftime(now(), "%d")
| map [|tstats count where index=* BY _time span=1d
| eval date_mday=strftime(_time, "%d")
| where date_mday <= tonumber($date_mday$)
| timechart span=1mon sum(count) AS count
| fillnull count value=0
| eval _time = _time + 24 * ($date_mday$ - 1) * 60 * 60]

lufermalgo · ‎09-09-2017

Hi @woodcock

Thank you very much for your help.
As my data is duplicated you should do a cleanup before, it generates some questions about how to do it then use tstat or create a data model, use tstat and then a dedup ...

| makeresults 
| eval date_mday=strftime(now(), "%d") 
| map 
    [| tstats summariesonly=t allow_old_summaries=t count from datamodel="eventsPDN" where (nodename="allEventsPDN") by "allEventsPDN.CDCLAVE" "allEventsPDN.VALOR" _time span=1d 
    | stats count by _time "allEventsPDN.CDCLAVE" "allEventsPDN.VALOR" 
    | eval date_mday=strftime(_time, "%d") 
    | where date_mday <= tonumber($date_mday$) 
    | timechart span=1mon sum(count) AS count 
    | fillnull count value=0 
    | eval _time = _time + 24 * ($date_mday$ - 1) * 60 * 60]

It took 500 seconds to give the result ...

woodcock · ‎09-09-2017

I completely do not understand your comment. Does it work or not? Why are you doing a tstats followed by a stats? Why do you think that I might speak Spanish (hablo poquito, pero....)?

lufermalgo · ‎09-09-2017

Create a data model called "eventsPDN", however, there is duplicate data. I have to leave the events that match the CDCLAVE and VALUE field before counting ...

Count events per month until a certain day

Can’t make it to .conf25? Join us online!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Unlock What’s Next: The Splunk Cloud Platform at .conf25

Are you a member of the Splunk Community?

Count events per month until a certain day

Can’t make it to .conf25? Join us online!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Unlock What’s Next: The Splunk Cloud Platform at .conf25