I have a customer who is using streamstats to validate that data is coming into Splunk. I recommended tstats with a count by index/hostname. Is one approach better than the other? We want to validate that data is coming in consistently, based on event counts.
Streamstats generates cumulative (running) aggregations over the results, so I'm not sure how it would be useful for checking that data is coming into Splunk. The tstats command runs on tsidx files (index-time metadata) and is lightning fast. So, as long as your check only involves metadata fields or indexed fields, tstats would be the way to go. If you can share the search the customer is using with streamstats, then we can say for sure whether tstats can replace it.
Oh yeah, you can use tstats for this, like this:
| tstats count WHERE index=euc_network90 sourcetype=era_full_syslog host=myhost by _time span=1d | accum count
Not sure if the streamstats was used correctly there.
Right, I use tstats. I'm trying to explain the difference to my customer: why their search isn't correct and what it is actually reporting. Not quite sure...
Here is how the streamstats is working (just sample data, adding a table command for better representation).
index=euc_network90 sourcetype=era_full_syslog host=myhost | table _time |streamstats count
This will generate data like this
_time    count
xxxxxx   1
xxxxxx   2
xxxxxx   3
xxxxxx   4
....
Adding a timechart on top of this would add up these running serial numbers rather than count the events, so it would give a wrong, much higher count: for the four events above, instead of getting 4 as the event count, the result would show 1 + 2 + 3 + 4 = 10.
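If it helps to show the customer the arithmetic, here is a minimal sketch that should run on any Splunk instance without touching real data. It assumes four synthetic events from makeresults stand in for the real events, and that the customer's chart ends up summing the count field (an assumption, since the original search was not shared). The output field names events and summed_running_count are made up for illustration.

```spl
| makeresults count=4
| streamstats count
| stats count AS events, sum(count) AS summed_running_count
```

This should return events=4 and summed_running_count=10: the true event count next to the inflated number you get from summing the running counter.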
If you can use tstats, then definitely do; it is much more efficient to gather your data from indexed metadata than by mining from inside of the events (buckets). This is a no-brainer. The problem is that many things cannot be done with