Splunk Search

"stats count" and other summary commands report ~2x the actual counts

tschrantz
New Member

We have a new sourcetype that uses the AWS Add-on to pull data from S3 (SQS-based). Whenever we run stats count, timechart, or a similar statistical command, we get counts that are 2-4x the actual data. For instance, if we do a stats count by a unique ID field, most of the IDs return a count of 2, but when we drill down, there's only one matching record. When we run "top", the percentages shown add up to well over 100%.

We've confirmed that there are no duplicates in the source data. None of our other sources seem to have this problem.

Also of note: When we set up data model acceleration and use tstats, the numbers are correct.


martin_mueller
SplunkTrust

The unique ID field (and possibly all the others) is most likely multi-value, with the same value contained twice... I'm guessing that's JSON data, and you have both INDEXED_EXTRACTIONS = json and KV_MODE = json set? That would cause this behaviour, because the fields get extracted once at index time and again at search time.

martin_mueller
SplunkTrust

Usually you'll want to keep indexed extractions and turn off the duplicate search-time JSON extraction.
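
A minimal sketch of what that could look like in props.conf, assuming a stanza named aws:s3:myjson (a made-up name; use your own sourcetype). INDEXED_EXTRACTIONS takes effect where the data is ingested, while KV_MODE and AUTO_KV_JSON are search-time settings, so they need to be in place on the search head. Also note that AUTO_KV_JSON defaults to true, so removing the line is not the same as setting it to false.

```ini
# props.conf -- illustrative sketch; the stanza name is hypothetical
[aws:s3:myjson]
# Index-time: extract the JSON fields once, when the data is ingested
INDEXED_EXTRACTIONS = json

# Search-time: do not extract the same JSON fields a second time
KV_MODE = none
AUTO_KV_JSON = false
```

After a change like this, piping a search through | table id should show each value once instead of twice.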


tschrantz
New Member

I just set INDEXED_EXTRACTIONS=none and that looks like it solved it, and it's still recognized as JSON. Thanks for putting me on the right path!


tschrantz
New Member

It is JSON data. We do have INDEXED_EXTRACTIONS=json, but we don't have KV_MODE=json set. We did have AUTO_KV_JSON=true, but I removed that and it didn't seem to make a difference.

You are on the right track with the multi-value fields, though. I piped my search through | table id and the values were duplicated in the output.
