I'm reading through https://docs.splunk.com/Documentation/Splunk/7.0.1/Troubleshooting/Aboutmetricslog trying to put together some analysis of forwarders in a timechart. I'm noticing that the tcp_KBps numbers are always higher than I would expect them to be. Shouldn't they essentially just be kb/60, if my bucket were 1 minute for instance?
Take the following two timecharts:
index=_internal sourcetype=splunkd host=*hf* group=tcpin_connections hostname=* | timechart span=1m sum(tcp_KBps) as "KBps" by host limit=50
index=_internal sourcetype=splunkd host=*hf* group=tcpin_connections hostname=* | timechart span=1m sum(kb) as "KB" by host limit=50
Shouldn't I expect in this case that the results of the second search should be essentially 60x per host per bucket compared to the first? That's not what I'm seeing. They follow the same trends, but the numbers do not add up. For instance, I will see a kb value of 676,000 and a KBps of 27,500 for the same host in the same one-minute time bucket. Shouldn't the KBps be around 11,266 (676,000 / 60)? What am I missing here?
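One possible reading, offered as an assumption rather than something confirmed in this thread: if more than one metrics event per connection lands in each 1-minute bucket, then sum(tcp_KBps) adds up several rates for the same connection, while sum(kb) / 60 averages the volume over the whole minute. The sketch below uses made-up numbers (two 30-second events at a steady 10 KB/s) purely to show the effect.

```python
# Hypothetical numbers: one connection sending a steady 10 KB/s,
# logged as two 30-second metrics events inside one 1-minute bucket.
events = [
    {"kb": 300.0, "tcp_KBps": 10.0},
    {"kb": 300.0, "tcp_KBps": 10.0},
]
bucket_seconds = 60

sum_kb = sum(e["kb"] for e in events)            # what sum(kb) would chart
sum_kbps = sum(e["tcp_KBps"] for e in events)    # what sum(tcp_KBps) would chart
avg_kbps = sum_kbps / len(events)                # what avg(tcp_KBps) would chart
rate_from_kb = sum_kb / bucket_seconds           # volume divided by bucket length

print(sum_kb, sum_kbps, avg_kbps, rate_from_kb)
```

In this toy case the summed rate is twice the true rate, while the averaged rate and sum(kb) / 60 agree. Counting the events per host per bucket would show whether this applies to your data.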
Sorry for bringing this old topic up, but since the main question remained more or less unexplained, I'd be glad to see it finally clarified. I have to deal with the same confusion about the values and it drives me crazy, as it makes throughput troubleshooting really annoying. I have a Splunk 7.3.3 instance in front of me that has been running for several hours, with metrics events gathered over a 60s interval:
[...] kb=95143.5849609375, _tcp_Bps=2064499.2670550928, _tcp_KBps=2016.112565483489, _tcp_avg_thruput=2016.112565483489, _tcp_Kprocessed=95143.5849609375, _tcp_eps=2806.537391302746 [...]
The latest docs (currently 8.0.1) still say what brian has quoted above:
But no matter how I try to align _tcp_KBps and kb (bits or bytes), they don't fit:
(95143 / 8 bit) / 60 sec ~ 198
95143 / 60 sec ~ 1586, which is still far away from the 2016
Funny to mention: _tcp_avg_thruput is identical to _tcp_KBps in my example, but should be measured in bytes(!) according to the documentation.
Any idea why these values don't line up, and which one to trust?
Are you sure about that? From the doc:
_tcp_Bps is the bytes transmitted during the metrics interval divided by the duration of the interval (in seconds)
_tcp_KBps is the same value divided by 1024
_tcp_avg_thruput is an average rate of bytes sent since the last time the tcp output processor was reinitialized/reconfigured. Typically this means an average since Splunk started.
_tcp_KProcessed is the total number of bytes written since the processor was reinitialized/reconfigured, divided by 1024.
_tcp_eps is the number of items transmitted during the interval divided by the duration of the interval (in seconds). Note that items will frequently not be events for universal/light forwarders (instead, data chunks).
kb is the bytes transmitted during the metrics interval divided by 1024.
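Taking these definitions at face value, the sample event quoted earlier in the thread can be checked with a little arithmetic. This is only a sketch: it assumes kb and _tcp_KBps describe the same interval, in which case their ratio is the duration (in seconds) that the rate was computed over. The variable names simply mirror the field names without the leading underscore.

```python
# Values copied from the metrics event quoted earlier in the thread.
kb = 95143.5849609375
tcp_Bps = 2064499.2670550928
tcp_KBps = 2016.112565483489

# Per the quoted doc, _tcp_KBps should equal _tcp_Bps / 1024.
kbps_from_bps = tcp_Bps / 1024

# Assumption: kb and _tcp_KBps cover the same interval, so
# kb / _tcp_KBps is the interval duration in seconds.
implied_seconds = kb / tcp_KBps

print(round(kbps_from_bps, 3), round(implied_seconds, 2))
```

The first check agrees with the doc (_tcp_Bps / 1024 matches _tcp_KBps). The implied duration comes out at roughly 47 seconds rather than 60, which would account for the gap between 1586 and 2016. Note also that in this event kb equals _tcp_Kprocessed and _tcp_KBps equals _tcp_avg_thruput; why that is the case is not something these numbers alone can settle.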