I'm trying to calculate volume growth by comparing the values of subsequent events from the df sourcetype. To get the current and previous values, I'm using streamstats like so:
index=os sourcetype=df host="HOST_NAME" | multikv | search MountedOn="VOLUME" | convert auto(UsePct) | streamstats current=f window=1 first(UsePct) as prevUsePct | table _time MountedOn prevUsePct UsePct
When I do this, the 'prevUsePct' value appears to be the UsePct value from the next record, not the previous one. So my output looks like this:
_time MountedOn prevUsePct UsePct
2013-10-10T09:10:14.000-0500 /data/vol_253 58 58
2013-10-10T09:05:14.000-0500 /data/vol_253 58 58
2013-10-10T09:00:14.000-0500 /data/vol_253 58 57
2013-10-10T08:55:14.000-0500 /data/vol_253 57 57
2013-10-10T08:50:15.000-0500 /data/vol_253 57 57
While I would expect to see something like this:
_time MountedOn prevUsePct UsePct
2013-10-10T09:10:14.000-0500 /data/vol_253 58 58
2013-10-10T09:05:14.000-0500 /data/vol_253 57 58
2013-10-10T09:00:14.000-0500 /data/vol_253 57 57
2013-10-10T08:55:14.000-0500 /data/vol_253 57 57
2013-10-10T08:50:15.000-0500 /data/vol_253 57 57
Hopefully this illustrates my concern. It appears that streamstats starts with the current record and looks ahead, not back as the documentation indicates.
Is this a bug, or am I misunderstanding the command?
Just sort your data the opposite way (oldest first) before running streamstats.
index=os sourcetype=df host="HOST_NAME" | multikv | search MountedOn="VOLUME" | convert auto(UsePct) | sort _time | streamstats current=f window=1 first(UsePct) as prevUsePct | table _time MountedOn prevUsePct UsePct
You might even be able to do
index=os sourcetype=df host="HOST_NAME" | multikv | search MountedOn="VOLUME" | convert auto(UsePct)| reverse | streamstats current=f window=1 first(UsePct) as prevUsePct | table _time MountedOn prevUsePct UsePct
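To see why the sort order matters, here is a toy Python model (not Splunk code, and not how Splunk is implemented) of what current=f window=1 first(UsePct) appears to do: each event receives the value of the event processed immediately before it. The function name prev_value and the shortened timestamps are assumptions made for illustration; the UsePct values are the ones from the question.

```python
def prev_value(events, field):
    """Give each event the `field` value of the event processed just before it."""
    rows = []
    previous = None  # nothing has been seen before the first event
    for event in events:
        row = dict(event)
        row["prev" + field] = previous
        previous = event[field]
        rows.append(row)
    return rows


# Rows from the question, in the order the search returns them (newest first).
newest_first = [
    {"_time": "09:10:14", "UsePct": 58},
    {"_time": "09:05:14", "UsePct": 58},
    {"_time": "09:00:14", "UsePct": 57},
    {"_time": "08:55:14", "UsePct": 57},
    {"_time": "08:50:15", "UsePct": 57},
]

# Processing newest first: "previous" is really the next event in time.
as_returned = prev_value(newest_first, "UsePct")

# Processing oldest first (the effect of sort _time or reverse).
oldest_first = sorted(newest_first, key=lambda e: e["_time"])
after_sort = prev_value(oldest_first, "UsePct")

for row in after_sort:
    print(row["_time"], row["prevUsePct"], row["UsePct"])
```

In this model the 09:00:14 event picks up 58 when the rows are processed newest first, which matches the output in the question, and the 09:05:14 event picks up 57 once the rows are processed oldest first, which is the expected result.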
The first() function means the first one that is encountered, not the first one in time. The function you really want is earliest(). But there's more: you can't just swap it out. Since Splunk returns events in reverse time order, you're of course seeing the opposite of what you want. I don't recommend using reverse, since it could mean you have to re-sort a very large data set before doing anything else. Instead, you should just include your current event, increase the window size to actually include the "next" (i.e., earlier) event, and reference that with the earliest() function:
... | streamstats current=t window=2 earliest(UsePct) as prevUsePct | ...
Ah, I see. Yes, the window on streamstats is backwards. It is a "trailing" window, which means it covers the current event and events seen "before", i.e., events that are later in time. So with current=t, last() will always refer to the current event. Unfortunately (and this is a common use case), this means that what you want to do needs to be done with reverse, or else with something like:
... | streamstats current=t window=2 latest(_time) as time_new latest(MountedOn) as MountedOn_new latest(UsePct) as UsePct_new earliest(UsePct) as prevUsePct | ...
except that kind of sucks.
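A rough Python sketch of that trailing-window idea, for anyone trying to picture it. This is a toy model under stated assumptions, not Splunk's implementation: the window holds the current event plus the one processed just before it, latest() and earliest() are modeled as picking by _time within that window, and the name trailing_pairs and the shortened timestamps are made up for illustration.

```python
from collections import deque


def trailing_pairs(events, window=2):
    """Model current=t window=2: pair each event with the one processed before it."""
    recent = deque(maxlen=window)  # the current event plus up to window-1 earlier-seen events
    rows = []
    for event in events:
        recent.append(event)
        newest = max(recent, key=lambda e: e["_time"])  # stands in for latest()
        oldest = min(recent, key=lambda e: e["_time"])  # stands in for earliest()
        rows.append({
            "time_new": newest["_time"],
            "UsePct_new": newest["UsePct"],
            "prevUsePct": oldest["UsePct"],
        })
    return rows


# Rows from the question, newest first, as the search returns them.
events = [
    {"_time": "09:10:14", "UsePct": 58},
    {"_time": "09:05:14", "UsePct": 58},
    {"_time": "09:00:14", "UsePct": 57},
    {"_time": "08:55:14", "UsePct": 57},
    {"_time": "08:50:15", "UsePct": 57},
]

pairs = trailing_pairs(events)
for row in pairs:
    print(row["time_new"], row["prevUsePct"], row["UsePct_new"])
```

In this model earliest() always lands on the current event, because the only other event in the window is later in time. That is why the "new" fields have to be pulled from latest() and the row effectively describes the later event rather than the current one.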
Right, I'm aware of that. It's just something that's not intuitive to people when they first encounter it. I'm wondering if streamstats is using the same 'newest to oldest' logic. Also, the suggestion is something I've already tried, and it still behaves as seen above. If you include current, it gives you the current value of the event. If you don't include current, you get the value for the next event.
| reverse | does work! Although it seems like the command is working differently than described. (It could be similar to stats functions like first() and last(), which also operate counterintuitively.)