"stats count" and other summary commands report ~2x the actual counts

tschrantz
New Member

We have a new sourcetype that uses the AWS Add-on to pull data from S3 (SQS-based). Whenever we run stats count, timechart, or a similar statistical command, we get counts that are 2-4x the actual numbers. For instance, if we do a stats count by a unique ID field, most of the IDs return a count of 2, but when we drill down, there's only one matching record. When we run top, the percentages shown add up to well over 100%.

We've confirmed that there are no duplicates in the source data. None of our other sources seem to have this problem.

Also of note: When we set up data model acceleration and use tstats, the numbers are correct.

1 Solution

martin_mueller
SplunkTrust

The unique ID field (and possibly all the others) is most likely multi-value, with the same value contained twice. I'm guessing that's JSON data, and you have both INDEXED_EXTRACTIONS = json and KV_MODE = json set? That would cause this behaviour: the fields get extracted once at index time and again at search time.

martin_mueller
SplunkTrust

Usually you'll want to keep indexed extractions and turn off the search-time JSON extraction that duplicates the values, typically with KV_MODE = none in props.conf on the search head.
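
For reference, a minimal sketch of what that could look like in props.conf. The stanza name is a placeholder for your actual sourcetype, and this assumes the index-time setting lives wherever the data is parsed (for example the instance running the AWS Add-on) while the search-time settings live on the search head. Adjust to your own deployment:

```
# props.conf where the data is parsed/indexed
# (stanza name is a placeholder, not a real sourcetype)
[your_json_sourcetype]
INDEXED_EXTRACTIONS = json

# props.conf on the search head, same sourcetype
[your_json_sourcetype]
# Fields already exist from index time, so skip the search-time JSON pass.
KV_MODE = none
AUTO_KV_JSON = false
```

If the double extraction was the cause, piping a search through | table id afterwards should show a single value per event instead of two.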

tschrantz
New Member

I just set INDEXED_EXTRACTIONS=none and that looks like it solved it, and it's still recognized as JSON. Thanks for putting me on the right path!

tschrantz
New Member

It is JSON data. We do have INDEXED_EXTRACTIONS=json, but we don't have KV_MODE=json set. We did have AUTO_KV_JSON=true, but I removed that and it didn't seem to make a difference.

You are on the right track with the multi-value fields, though. I piped my search through | table id and the values were duplicated in the output.
