Splunk Search

Different distinct count for stats and timechart (same time interval)

hemendralodhi
Contributor

Hello,

For the same base query I am getting different distinct count results from timechart and stats over the same time range (an old time range, so that no new events come in).

stats - query - mysearch | stats dc(field)

I ran the query for the 2 hours between 16:00 and 18:00 and got a result of 507.

Result (running it individually for each 1 hr)
16:00 - 17:00 - 293
17:00 - 18:00 - 223
** The two hourly results sum to 516, a difference of 9 compared to running it for the complete 2 hr.

timechart - query - mysearch | timechart span=1h dc(Field)

Result (for the whole 2 hours)
16:00 - 293
17:00 - 223

Result (running it individually for each 1 hr)
16:00 - 293
17:00 - 223

It seems stats gives the correct result when searched over separate 1 hr intervals, but not when run over the full 2 hours.

I am at a loss as to what is happening.

Please advise.

Thanks

knielsen
Contributor

I don't see a problem with the numbers you posted. Whenever a value of Field occurs in both hours, it contributes to the distinct count of each hour, but it is counted only once when you look at the full two hours.

hemendralodhi
Contributor

Thanks for the response. If you compare the total count with the individual counts, they differ for stats. Running stats for 2 hrs gives count = 507. Running it individually gives count = 293 + 223 = 516.

knielsen
Contributor

Yes, and that is totally fine. Consider this example with 3 events in 2 hours:

16:00 field=123
16:01 field=234
17:01 field=123

If you do a dc(field) for the first hour, you get 2 as the result, because there are 2 different values for field. If you do a dc(field) for the second hour alone, you get 1, because there is only one value. That doesn't mean, though, that you get 3 if you do a dc(field) over the whole time. The result is 2, because there are still only two distinct values for field. So the sum of dc() for the first hour and dc() for the second hour, which is 3, is different from the dc() over the whole time range. That is perfectly fine.

Your numbers tell us that 9 field values occurred in both the first and the second hour (293 + 223 - 507 = 9); the rest were unique to a single hour.
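
To make the set arithmetic concrete, here is a minimal sketch in Python rather than SPL. It assumes the three events above are reduced to hypothetical (hour bucket, value) pairs; the variable names are illustrative only and nothing here reflects how Splunk computes dc() internally.

```python
from collections import defaultdict

# Hypothetical events mirroring the three-event example: (hour bucket, field value)
events = [("16:00", "123"), ("16:00", "234"), ("17:00", "123")]

# Collect the set of distinct values seen in each hour bucket
per_hour = defaultdict(set)
for hour, value in events:
    per_hour[hour].add(value)

# Distinct count per hour, analogous to what a per-hour dc(field) reports
hourly_dc = {hour: len(values) for hour, values in per_hour.items()}

# Distinct count over the whole range is the size of the union of the hourly sets
overall_dc = len(set().union(*per_hour.values()))

# With exactly two buckets, sum of hourly counts minus overall count
# equals the number of values that appear in both hours
overlap = sum(hourly_dc.values()) - overall_dc

print(hourly_dc)   # {'16:00': 2, '17:00': 1}
print(overall_dc)  # 2
print(overlap)     # 1

# The same arithmetic applied to the numbers from the question
reported_overlap = 293 + 223 - 507
print(reported_overlap)  # 9
```

The subtraction gives the overlap directly only because there are exactly two hourly buckets; with more buckets a value could repeat in several of them and the simple difference would count it more than once.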

gcusello
SplunkTrust

Hi hemendralodhi,

Could you share the time ranges you used in stats and timechart?

Do you get the same result if you run your stats search now (maybe some events were indexed later)?

Bye.
Giuseppe

hemendralodhi
Contributor

Thanks for your response. I ran the search over a time range from a few hours back.

Time Range Used : 16:00:00.000 - 18:00:00.000
