I discovered a few more edge cases which my previous search query couldn't handle: If a case was created and its status was changed from new to in_progress before the next REST API poll then that would be the first event in Splunk and, because of its status, it would be filtered out and wouldn't be counted. If a case's creation event was missed, then a user being assigned to the case would be seen as the first event and, because the the modification _time was being used, it would appear in the wrong place on the sparkline (history / trend graph). If a case was closed then its status was changed to a different resolution type then the closure would be double counted. I resolved these with the following new search query: index="<client>"
| dedup case_id, status sortby -_time
| reverse
| eval openClosed = case(match(status, "new|in_progress"), "Open", match(status, "resolved_other|resolved_auto"), "Closed")
| streamstats current=f last(status) as lastStatus by case_id
| eval firstEvent = if(openClosed == "Open" AND isnull(lastStatus), "True", "False")
| eval lastEvent = if(openClosed == "Closed", "True", "False")
| where firstEvent == "True" OR lastEvent == "True"
| dedup case_id, openClosed sortby -_time
| append [| makeresults | eval _time = 0, alert_count = 10000, openClosed = "Closed"]
| eval _time = if(status == "new", strptime(creation_time, "%Y/%m/%d %H:%M:%S"), _time)
| sort +_time
| eval alert_count_change = case(openClosed == "Open", "+" . alert_count, openClosed == "Closed", "-" . alert_count)
| streamstats sum(alert_count_change) as alert_count_open_total
| streamstats min(alert_count_open_total) as alert_count_open_min
| eval alert_count_open_total_nonegatives = case(alert_count_open_total == alert_count_open_min, 0, alert_count_open_total < 0 AND alert_count_open_total > alert_count_open_min, alert_count_open_total - alert_count_open_min, alert_count_open_total > 0 AND alert_count_open_total < alert_count_change, alert_count_change, 1=1, alert_count_open_total)
| table _time, alert_count_open_total_nonegatives, alert_count_open_total, alert_count_change, alert_count, case_id, status, lastStatus, openClosed, firstEvent, lastEvent Final explanation (hopefully): | dedup case_id, status sortby -_time This gets the latest event per status for each case. This is useful in some cases such as: An existing case is updated with an increased alert count but the status doesn't change from new. A case's assigned user is changed but the status doesn't change from in_progress. | reverse By default, the events are ordered by newest first but streamstats is order-sensitive so we reverse to get the oldest events first otherwise our latest / current value would actually be the oldest. | eval openClosed = case(match(status, "new|in_progress"), "Open", match(status, "resolved_|etc"), "Closed") As there are various types of open and closed statuses, this consolidates all of them into simple values for use in upcoming evaluations. | streamstats current=f last(status) as lastStatus by case_id This looks at the previous event and stores its status which we'll need for determining whether an event is a "middle" one (not the case creation or closure). | eval firstEvent = if(openClosed == "Open" AND isnull(lastStatus), "True", "False") | eval lastEvent = if(openClosed == "Closed", "True", "False") | where firstEvent == "True" OR lastEvent == "True" This filters out any of the aforementioned "middle" events, leaving us with the case creation and closure events of which we'll need their alert counts for the sparkline sum. | dedup case_id, openClosed sortby -_time Sometimes there are multiple closure events for the same case if, for example, a resolved status was changed to a different type. For those scenarios, this gets the last instance of Closed = the last closure event. | append [| makeresults | eval _time = 0, alert_count = 10000, openClosed = "Closed"] A later case evaluation that prevents a count from going negative has a problem where it can't handle a value of 0 so, to resolve that, we need to keep the value away from 0 by creating a negative baseline. To do that, we need to prefix a fake event with (1) an extreme alert count which a later evaluation will convert to negative due to it being closed and (2) a timestamp which is guaranteed to be the first / earliest event. | eval _time = if(status == "new", strptime(creation_time, "%Y/%m/%d %H:%M:%S"), _time) In our scenario, each event contains two time fields: creation_time and modification_time, _time being parsed from the latter. So, if the status is new then we need to change the value of _time to a Unix epoch version of creation_time which resolves newly-discovered edge case #2. | sort +_time Earlier, we used reverse which put all of the events in the right order but we've since done a few different things which have messed up the timeline so we need to order from oldest to newest events again to ensure that the sparkline is correct. | eval alert_count_change = case(openClosed == "Open", "+" . alert_count, openClosed == "Closed", "-" . alert_count) Open and closed cases have the exact same number of alerts because it's only the status that changes but we want the closed cases' alert counts to be subtracted from the open figure and vice versa. Fortunately, a plus-minus equals a minus (something I haven't even thought of since high school which I'm now thankful for) so all we have to do is prefix a minus symbol and the sum command takes care of the rest. Prefixing the plus symbol isn't necessary but it keeps things easy to read and allows you to search for explicitly positive values. Also, even though it looks like a string concatenation, the result is actually a number type, as verified by testing with typeof(). | streamstats sum(alert_count_change) as alert_count_open_total This sums the (positive and negative) values of the column / fields alert_count_change line by line. | streamstats min(alert_count_open_total) as alert_count_open_min | eval alert_count_open_total_nonegatives = case(alert_count_open_total == alert_count_open_min, 0, alert_count_open_total < 0 AND alert_count_open_total > alert_count_open_min, alert_count_open_total - alert_count_open_min, alert_count_open_total > 0 AND alert_count_open_total < alert_count_change, alert_count_change, 1=1, alert_count_open_total) Sometimes we don't have the events for case creations because they were already closed when the case data was retrieved which results in negative numbers but we can't have negative cases open so, as mentioned earlier, this prevents the value from going negative. Again, big thank you to @ALXWBR for asking this question and to @DavidHourani for answering it at https://community.splunk.com/t5/Splunk-Search/Streamstats-sum-that-doesn-t-go-below-zero/td-p/454371 | table _time, alert_count_open_total_nonegatives, alert_count_open_total, alert_count_change, alert_count, case_id, status, lastStatus, openClosed, firstEvent, lastEvent A table is required for statistics, the first value from the left is used for the X axis so we use _time, and the second value is used for the Y axis so we use alert_count_open_total_nonegatives. Everything else is just there in case we need to diagnose a problem or export the data. If per-severity trends are desired then simply append severity_count_field > 0 to the first search term. Just be aware that this radically changes the sparklines as you can see below:
... View more