Given this excerpt from log files I generate and index:
2016-11-19 20:34:21 GMT vehicle_id="1009" route="E" speed=0 distance=136 stop_tag="4502"
2016-11-19 20:36:44 GMT vehicle_id="1009" route="E" speed=13 distance=4 stop_tag="4529"
2016-11-19 19:46:23 GMT vehicle_id="1006" route="E" speed=21 distance=140 stop_tag="7795"
2016-11-19 20:18:10 GMT vehicle_id="1007" route="E" speed=9 distance=42 stop_tag="5240"
2016-11-19 20:38:28 GMT vehicle_id="1009" route="E" speed=21 distance=281 stop_tag="4516"
you'll notice that the time-stamps are out-of-order. The way these are generated is that a web site is polled and it returns the set of vehicles that have changed since the last poll. Included also (but not shown here) is how long ago the vehicle "phoned home" to the web site. When I generate a log file entry, I subtract the "ago" value from "now" to get the time at which the vehicle actually transmitted the data and use that time for the time-stamp. Hence the time-stamps are out-of-order.
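A minimal sketch of the time-stamp calculation described above, in Python. The function name, the `secs_ago` parameter, the poll time, and the "ago" values are all hypothetical, chosen only so that the output lines up with the excerpt. The actual generator is not shown in this post.

```python
from datetime import datetime, timedelta, timezone

def event_timestamp(now: datetime, secs_ago: int) -> str:
    """Return the log time-stamp for a report sent `secs_ago` seconds before `now`."""
    transmitted = now - timedelta(seconds=secs_ago)
    return transmitted.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S GMT")

# Hypothetical poll: one response containing vehicles that reported at different times.
poll_time = datetime(2016, 11, 19, 20, 40, 0, tzinfo=timezone.utc)
reports = [("1009", 339), ("1006", 3217), ("1007", 1310)]  # (vehicle_id, seconds ago)

# Entries are written in poll order, so their time-stamps are not monotonic.
lines = [f'{event_timestamp(poll_time, ago)} vehicle_id="{vid}"' for vid, ago in reports]
for line in lines:
    print(line)
```

Because each entry is stamped with the transmission time rather than the poll time, a single poll can yield entries that are newer, then older, then newer again.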
Whenever I want to write searches against this data, do I have to do anything special? For example, if I use streamstats as part of a search, does Splunk compensate for the out-of-order events and return them in correct time-order? Or do I always have to include sort _time as part of the search pipeline prior to invoking streamstats?
Do I have to do something else to compensate for other search commands?
Note to aaraneta_splunk: This question is NOT only about streamstats. I used streamstats as an example only. Please do not edit the question again. Thanks.
You don't have to do anything beyond making sure Splunk recognizes the timestamps.
Two reasons:
See http://docs.splunk.com/Documentation/Splunk/6.5.1/SearchReference/sort for reference.
Then you'd think that adding | sort _time | wouldn't change the results, but it does. Thoughts as to why?