We have a new sourcetype that's using the AWS Add-on to grab data from S3 (SQS-based). Whenever we run stats count, timechart, or a similar statistical command, we get counts that are 2-4x the actual number of events. For instance, if we do a stats count by a unique ID field, most of the IDs return a count of 2, but when we drill down, there's only one matching record. When we do a "top", the percentages shown add up to well over 100%.
We've confirmed that there are no duplicates in the source data. None of our other sources seem to have this problem.
Also of note: When we set up data model acceleration and use tstats, the numbers are correct.
The unique ID field (and possibly all others) is most likely multi-value, with the value contained twice. I'm guessing that's JSON data, and you have both INDEXED_EXTRACTIONS = json and KV_MODE = json set? That would cause this behaviour.
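As a sketch, the conflicting configuration would look something like this in props.conf (the sourcetype name here is made up; use your own):

```
# props.conf -- hypothetical stanza name
# With BOTH of these set, each JSON field is extracted at index time
# AND again at search time, so every field becomes multi-value with
# the same value twice -- which inflates stats/timechart/top counts.
[aws:s3:myjson]
INDEXED_EXTRACTIONS = json
KV_MODE = json
```

This also fits the tstats observation in the question: tstats over the accelerated data model reads the indexed copy only, so it doesn't see the search-time duplicate.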
Usually you'll want to keep the indexed extractions and turn off the search-time extraction that's duplicating the fields.
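A sketch of that fix, again with a hypothetical sourcetype name:

```
# props.conf -- keep index-time extraction, disable the search-time pass
[aws:s3:myjson]
INDEXED_EXTRACTIONS = json
KV_MODE = none
AUTO_KV_JSON = false
```

(The alternative, as the original poster ended up doing, is the reverse: INDEXED_EXTRACTIONS = none with search-time JSON extraction left on. Either way, only one of the two extraction passes should be active.)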
I just set INDEXED_EXTRACTIONS=none and that looks like it solved it, and it's still recognized as JSON. Thanks for putting me on the right path!
It is JSON data. We do have INDEXED_EXTRACTIONS=json, but we don't have KV_MODE=json set. We did have AUTO_KV_JSON=true, but I removed that and it didn't seem to make a difference.
You are on the right track with the multi-value fields, though. I piped my search through | table id and the values were duplicated in the output.
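For anyone hitting the same symptom, a quick way to confirm the duplication before touching props.conf is to count the values per event (field name `id` as in the example above; substitute your own sourcetype and field):

```
sourcetype=aws:s3:myjson
| eval value_count = mvcount(id)
| stats count by value_count
```

If the double-extraction problem is present, most events will show a value_count of 2 rather than 1.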