Solved: Suggestions for charting backlogs by month?

aamirs291 · ‎08-21-2017

Everyone,

I would like to know of suggestions for charting backlogs by month.

So a backlog in my scenario are tickets which have, for example, a "start_time" < August 1st, 2017 AND resolve_time > August 31st, 2017 i.e. tickets which were started being worked on before August 1st but didn't resolve even after August 31st.
This would need to be charted for each month; starting from Nov 1st, 2012 until now.

For now, I am trying the following code :

index="my_index"
| eval time_sub = strptime(start_time,"%d/%m/%y %H:%M:%S")
| eval time_res = strptime(resolve_time,"%d/%m/%y %H:%M:%S")
| eval Backlog_check = if((time_sub < 1501545600 AND time_res > 1504223999),"Backlog","Resolved")

| stats count by Backlog_check

The problem with this approach is that I am only getting the counts for the month of August 2017. I could replace 1501545600 to be substituted by a token value from the time range picker eg. $time.earliest$ and for 1504223999 by $time.latest$ and make this more dynamic but then the search would run only for the values passed down by the time range picker tokens which is not what I completely need.

A clean approach to resolve this would be helpful.

DalJeanis · ‎08-21-2017

Hmmm. This exact scenario has been handled a couple of times. I'll see if I can find the writeup...

Update - one writeup is here - https://answers.splunk.com/answers/513002/how-to-graph-sum-of-overlapping-values-given-start.html#an...

...but here's the general method.

This writeup down to the next bar is to know how many tickets are open on a given day, but it can be easily adapted to your needs.

1) Select ALL events with a null close date or a close date greater than your start date.
2) Turn each event with a null close date into a single event, as/of the opendate, with a +1 opencount value.

3) Turn each event with a valid close date into two events, an open event as above, plus a close event as/of the close data, with a -1 opencount value.

4) | stats sum(opencount) as opencount by _time
5) | streamstats sum(opencount) as opencount
6) Kill any dates prior to the start of your time period, because you have only accounted for events still open at the beginning of your time period, so those dates have deflated numbers from their actuals.

That gives you a graph of how many events were in open status at the end of each day in question.

Now, here's how to adapt it.

1) Select ALL events with a null close date or a close date greater than your start date.
2) Calculate relative_time +2mon@mon from start date.
3) If the event was closed before that date, discard the event.
4) If that date has not yet been reached, discard the event.
5) Calculate relative_time +1mon@mon from start date. This is the new start-date, for purposes of making the graph.

6) Calculate the relative_time @mon from close_date. This is the new close date, for purposes of making the graph.
7) Proceed as above.

Update - the above logic was verified with run-anywhere sample code below.

Run-anywhere demo code.

| makeresults  
| eval mydates="2017-02-05,2017-03-15 2017-02-15,2017-04-15 2017-02-12,2017-04-01 2017-02-05,NULL 2017-03-18,2017-03-30 2017-02-05,2017-02-15 2017-07-05,NULL" 
| makemv mydates 
| mvexpand mydates 
| rex field=mydates "(?<StartDate>[^,]*),(?<EndDate>.*)"
| eval StartDate=strptime(StartDate,"%Y-%m-%d")
| eval EndDate=strptime(EndDate,"%Y-%m-%d")
| rename COMMENT as "The above generates sample data."

| rename COMMENT as "Adaptation Steps 2-4."
| eval MightBeLateDate = relative_time(StartDate,"+2mon@mon")
| where (MightBeLateDate<=EndDate) OR (isnull(EndDate) AND MightBeLateDate<=now()) 

| rename COMMENT as "Adaptation Steps 5-6."
| eval GraphStartDate=relative_time(StartDate,"+1mon@mon") 
| eval GraphEndDate=relative_time(EndDate,"@mon") 

| rename COMMENT as "Original Steps 2-3."
| eval myfan=mvappend("first","second")
| mvexpand myfan
| eval _time = if(myfan="first",GraphStartDate,GraphEndDate)
| eval OverDueCount = if(myfan="first",+1,-1)
| where isnotnull(_time)

| rename COMMENT as "UPDATED:  Add one zero record per month if your use case has sparse data."
| appendpipe 
    [ | stats min(_time) as mintime 
      | eval thismonth=relative_time(now(),"@mon") 
      | eval numMonths=ceiling((thismonth-mintime)/86400/30) 
      | eval whichmonth=mvrange(0,numMonths) 
      | mvexpand whichmonth 
      | eval _time=relative_time(mintime+whichmonth*2635200+864000,"@mon")  
      | eval OverDueCount=0 
      | where _time<relative_time(now(),"@mon")
      ]
| table _time OverDueCount

| rename COMMENT as "Original Steps 4-5."
| stats sum(OverDueCount) as OverDueNet by _time 
| streamstats sum(OverDueNet) as OverDueCount

| rename COMMENT as "Original Step 6 goes here - eliminate any dates not fully calculated."
| rename COMMENT as "    | where _time>= some calculation here  AND _time <= some other calculation"

| rename COMMENT as "Now chart the results."
| timechart span=1mon max(OverDueCount) as OverDueCount

View solution in original post

DalJeanis · ‎08-21-2017

Hmmm. This exact scenario has been handled a couple of times. I'll see if I can find the writeup...

Update - one writeup is here - https://answers.splunk.com/answers/513002/how-to-graph-sum-of-overlapping-values-given-start.html#an...

...but here's the general method.

This writeup down to the next bar is to know how many tickets are open on a given day, but it can be easily adapted to your needs.

1) Select ALL events with a null close date or a close date greater than your start date.
2) Turn each event with a null close date into a single event, as/of the opendate, with a +1 opencount value.

3) Turn each event with a valid close date into two events, an open event as above, plus a close event as/of the close data, with a -1 opencount value.

4) | stats sum(opencount) as opencount by _time
5) | streamstats sum(opencount) as opencount
6) Kill any dates prior to the start of your time period, because you have only accounted for events still open at the beginning of your time period, so those dates have deflated numbers from their actuals.

That gives you a graph of how many events were in open status at the end of each day in question.

Now, here's how to adapt it.

1) Select ALL events with a null close date or a close date greater than your start date.
2) Calculate relative_time +2mon@mon from start date.
3) If the event was closed before that date, discard the event.
4) If that date has not yet been reached, discard the event.
5) Calculate relative_time +1mon@mon from start date. This is the new start-date, for purposes of making the graph.

6) Calculate the relative_time @mon from close_date. This is the new close date, for purposes of making the graph.
7) Proceed as above.

Update - the above logic was verified with run-anywhere sample code below.

Run-anywhere demo code.

| makeresults  
| eval mydates="2017-02-05,2017-03-15 2017-02-15,2017-04-15 2017-02-12,2017-04-01 2017-02-05,NULL 2017-03-18,2017-03-30 2017-02-05,2017-02-15 2017-07-05,NULL" 
| makemv mydates 
| mvexpand mydates 
| rex field=mydates "(?<StartDate>[^,]*),(?<EndDate>.*)"
| eval StartDate=strptime(StartDate,"%Y-%m-%d")
| eval EndDate=strptime(EndDate,"%Y-%m-%d")
| rename COMMENT as "The above generates sample data."

| rename COMMENT as "Adaptation Steps 2-4."
| eval MightBeLateDate = relative_time(StartDate,"+2mon@mon")
| where (MightBeLateDate<=EndDate) OR (isnull(EndDate) AND MightBeLateDate<=now()) 

| rename COMMENT as "Adaptation Steps 5-6."
| eval GraphStartDate=relative_time(StartDate,"+1mon@mon") 
| eval GraphEndDate=relative_time(EndDate,"@mon") 

| rename COMMENT as "Original Steps 2-3."
| eval myfan=mvappend("first","second")
| mvexpand myfan
| eval _time = if(myfan="first",GraphStartDate,GraphEndDate)
| eval OverDueCount = if(myfan="first",+1,-1)
| where isnotnull(_time)

| rename COMMENT as "UPDATED:  Add one zero record per month if your use case has sparse data."
| appendpipe 
    [ | stats min(_time) as mintime 
      | eval thismonth=relative_time(now(),"@mon") 
      | eval numMonths=ceiling((thismonth-mintime)/86400/30) 
      | eval whichmonth=mvrange(0,numMonths) 
      | mvexpand whichmonth 
      | eval _time=relative_time(mintime+whichmonth*2635200+864000,"@mon")  
      | eval OverDueCount=0 
      | where _time<relative_time(now(),"@mon")
      ]
| table _time OverDueCount

| rename COMMENT as "Original Steps 4-5."
| stats sum(OverDueCount) as OverDueNet by _time 
| streamstats sum(OverDueNet) as OverDueCount

| rename COMMENT as "Original Step 6 goes here - eliminate any dates not fully calculated."
| rename COMMENT as "    | where _time>= some calculation here  AND _time <= some other calculation"

| rename COMMENT as "Now chart the results."
| timechart span=1mon max(OverDueCount) as OverDueCount

aamirs291 · ‎08-23-2017

Thank you for the response DalJeanis .

I did some modifications to line 11 as follows :
| eval MightBeLateDate = relative_time(StartDate,"+1mon@mon")

+1mon@mon was to define the threshold for late resolution of a ticket to be the start of the next month.

Also to line 15 as follows :
| eval GraphStartDate=relative_time(StartDate,"@mon")

@mon was to define the graphing month to be the month the ticket was started to be worked on.

I had one question though; for the dates you have given as examples the backlogs should have been ;
For February , 4 backlogs
1st ; Start = 5th Feb and End = 15th March
2nd; Start = 15th Feb and End = 15th April
3rd; Start = 12th Feb and End = 1st April
4th; Start = 5th Feb and End = NULL

For March; 3 backlogs
1st from the above would not be a backlog since it got resolved in March.
2nd; Start = 15th Feb and End = 15th April
3rd; Start = 12th Feb and End = 1st April
4th; Start = 5th Feb and End = NULL

For April; 1 backlog
2nd and 3rd from the above would get dropped off since they resolved in April,
4th; Start = 5th Feb and End = NULL

For May and June; 1 Backlog for each month. (This isn't getting charted in the graph though. Can you suggest what I could do to chart the one backlog which must be here ?)
4th; Start = 5th Feb and End = NULL [ This ticket started in Feb and until May and June this ticket didnt close; this must be a Backlog ]

For July ;
4th; Start = 5th Feb and End = NULL
5th; Start = 5th July and End = NULL

Are the changes I have done correct ? Because in the code which you have written the backlogs are being charted from March(3) and April(1) which is correct but Feb and July weren't .
Additionally could you let me know how I could chart the Backlog from Feb (not closed ticket) in the months of May and June ?

DalJeanis · ‎08-24-2017

@aamirs291 - The code automatically takes care of the open tickets like February/Null, because line 23 kills the record that would have decremented the counter. The summary numbers from Apr-July were invisible because of the unrealistic situation that there were no tickets either opened or closed in those months, so no roll-up records were created.

This code would have worked for you in real life, but we have gone ahead and updated the code so that any people who later need to adapt the code to sparse use cases won't run into this situation.

I believe that the above oversight may have confused you on the other two points.

With regard to February, your original specifications said that tickets would count as backlog in August only if they were opened prior to August 1 and remained open after August 31. As such, since no tickets were opened prior to February 1, there should be no backlog in February. Correct?

Your changes to MightBeLateDate could be appropriate if tickets would be late on March 1 when opened in February, as per your revised specs. By the original specs, it is not possible for a ticket opened Feb 1 to be backlogged before Apr 1... at which point it counts as backlogged for the month of March.

The revised code meets the original specs. If you'd like to change the specs, just let us know, and we can provide some adjusted code.

aamirs291 · ‎08-25-2017

Yes DalJeanis, you are right about the previous code working in real life.

And yes during the course of these responses I also realized that a ticket, if opened in February and not closed in February itself, should also qualify as a Backlog for that month; something I missed in the original question.

So there are two scenarios for a ticket to be backlog:
1) Backlogs from previous months. Eg: a ticket opened in January and with no resolve date is a backlog for all the months following January,

2) Backlog when a ticket is opened in Feb for example and doesn't resolve in the same month.

I feel that both these scenarios have been looked into now.

I think the GraphStartDate's month is a matter of personal preference.

Thank you for following up on this DalJeanis.

DalJeanis · ‎08-25-2017

@aamirs291 - no, not preference: specifications.

Now that you have corrected the specifications, then your changes to MightBeLateDate and GraphStartDate, lines 11 and 15, are REQUIRED, not optional. '

I'm going to leave the original code as is, since it matches the original specs and changing it might confuse people more than required.

The following code has a couple of minor tweaks due to the new specs. Since anything opened in the prior month is already backlogged for the current month, the calculation of backlog can go one month further forward up to he current month.

Also, I've set up a spot where you can change the GraceDays parameter if, for example, an event opened won't count as backlogged until 7 days have passed.

Final code -

 | makeresults  
 | eval mydates="2017-02-05,2017-03-15 2017-02-25,2017-04-15 2017-02-12,2017-04-01 2017-02-05,NULL 2017-03-18,2017-03-30 2017-02-05,2017-02-15 2017-07-05,NULL" 
 | makemv mydates 
 | mvexpand mydates 
 | rex field=mydates "(?<StartDate>[^,]*),(?<EndDate>.*)"
 | eval StartDate=strptime(StartDate,"%Y-%m-%d")
 | eval EndDate=strptime(EndDate,"%Y-%m-%d")
 | rename COMMENT as "The above generates sample data."
 | rename COMMENT as "Adaptation Steps 2-4."
 | rename COMMENT as "UPDATED:  Backlog begins at the first monthly boundary after some number of GraceDays have passed."
 | rename COMMENT as "UPDATED:  Change the 0 below to how many days before the first monthend the event must have already started to count as backlog."
 | eval GraceDays = 0*86400 
 | eval MightBeLateDate = relative_time(StartDate,"+1mon@mon")
 | where (MightBeLateDate+GraceDays<=EndDate) OR (isnull(EndDate) AND MightBeLateDate+GraceDays<=now()) 
 | rename COMMENT as "Adaptation Steps 5-6."
 | rename COMMENT as "UPDATED: Backlog begins at the first monthly boundary if GraceDays have passed, turn off nulls at +2mon."
 | eval GraphStartDate=relative_time(StartDate,"@mon") 
 | eval GraphEndDate=relative_time(  coalesce(EndDate,relative_time(now(),"+2mon")),"@mon")       
 | rename COMMENT as "Original Steps 2-3."
 | eval myfan=mvappend("first","second")
 | mvexpand myfan
 | eval _time = if(myfan="first",GraphStartDate,GraphEndDate)
 | eval OverDueCount = if(myfan="first",+1,-1)
 | rename COMMENT as "UPDATED:  Add one zero record per month if your use case has sparse data."
 | appendpipe 
     [ | stats min(_time) as mintime 
       | eval thismonth=relative_time(now(),"@mon") 
       | eval numMonths=ceiling((thismonth-mintime)/86400/30) 
       | eval whichmonth=mvrange(0,numMonths) 
       | mvexpand whichmonth 
       | eval _time=relative_time(mintime+whichmonth*2635200+864000,"@mon")  
       | eval OverDueCount=0 
       ]
 | table _time OverDueCount
 | rename COMMENT as "Original Steps 4-5."
 | stats sum(OverDueCount) as OverDueNet by _time 
 | streamstats sum(OverDueNet) as OverDueCount
 | rename COMMENT as "Original Step 6 goes here - eliminate any dates not fully calculated."
 | rename COMMENT as "    | where _time>= some calculation here  AND _time <= some other calculation"
 | rename COMMENT as "Now chart the results."
 | timechart span=1mon max(OverDueCount) as OverDueCount
 | where _time<relative_time(now(),"+1mon@mon")

aamirs291 · ‎08-28-2017

Thank you for correcting mistakes and for the help DalJeanis.

aamirs291 · ‎10-04-2017

DalJeanis,

Could you suggest of a way in which I could split-by a field in the data for the backlog ? For example for a given month a ticket qualified for Backlog having a Ticket type of "Incident" and another ticket with a ticket type of "Service Request".
It could also be the case that for the next month there was neither Incident or Service Request for backlog. So streamstats the way it has been applied now isn't giving me the correct results for example if Jan had 2 backlog (1 for each Ticket Type) and then Feb had none and March had 2; the 2 from Jan are getting added to March but Feb should also have had 2.
Maybe I would need to have null value fields for the Ticket Type of Service Request and Incident for each month even if the backlog count was 0 so that streamstats could work?

Let me know if this needs to be posted as a new question and I would then post it as such.

Suggestions for charting backlogs by month?

Extending Observability Content to Splunk Cloud

More Control Over Your Monitoring Costs with Archived Metrics!

New in Observability Cloud - Explicit Bucket Histograms