I am using distinct count with timechart for the whole day (yesterday). The result differs depending on whether the span is 1h or 1d.
| timechart span=1h dc(SessionId) as "TotalCustomerSession"| addcoltotals labelfield=_time label=Total| sort -_time
gives me a result 10802
| timechart span=1h dc(SessionId) as "TotalCustomerSession"| addcoltotals labelfield=_time label=Total| sort -_time
gives me result 10399
Confused!
I notice that both your queries above say "span=1h". Is the second one - the one with the lower result - supposed to be "span=1d"?
If so, here's a possibility:
For span=1h, it counts the distinct Session IDs in each hour and sums them up.
For span=1d, it's counting all the distinct IDs in the day.
Let's say you have Session ID 12345 that appears at 2 pm and at 7 pm. Perhaps with span=1h it gets counted twice - once for the 2pm span and once for the 7 pm span. But with span=1d it just gets counted once.
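To make the arithmetic concrete, here is a small Python sketch of the same idea. It is not Splunk code. The (hour, SessionId) pairs are made-up sample data, and the sets only imitate what dc() does per time bucket.

```python
from collections import defaultdict

# Hypothetical sample events as (hour of day, SessionId); not real data.
events = [(14, "12345"), (19, "12345"), (14, "67890"), (15, "24680")]

# Imitate span=1h: one set of distinct SessionIds per hourly bucket.
sessions_by_hour = defaultdict(set)
for hour, session_id in events:
    sessions_by_hour[hour].add(session_id)

hourly_dc = {hour: len(ids) for hour, ids in sessions_by_hour.items()}

# Imitate addcoltotals: sum the hourly distinct counts.
sum_of_hourly_dc = sum(hourly_dc.values())

# Imitate span=1d: one bucket, so each SessionId is counted once.
daily_dc = len({session_id for _, session_id in events})

print(hourly_dc)         # {14: 2, 19: 1, 15: 1}
print(sum_of_hourly_dc)  # 4
print(daily_dc)          # 3
```

Session 12345 is counted in both the 14:00 and 19:00 buckets, so the hourly total (4) comes out higher than the daily distinct count (3). The summed hourly value can never be lower than the daily one.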
What is your time range set to? Are you saying that if you sum the hourly distinct counts and compare that sum with the 1-day value, the unique counts are different? Are you snapping time?
You should add this to the top line of your search to check
earliest=-1d@d latest=@d
I am saying that distinct_count with timechart span=1h gives a higher total than timechart span=1d (which is the correct value).
Are you seeing the same number of events for yesterday in both searches?
When you specify span, Splunk looks for related buckets to pull the events, so there will be a difference in count.
If I use dedup and count by hour, the result matches span=1d. But with dc it does not.
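That is what you would expect if dedup removes repeated SessionIds across the whole day before the hourly count happens. A rough Python sketch of that behaviour follows. It uses made-up sample data and assumes dedup keeps only the first event it encounters for each SessionId; which hour a session ends up in depends on the order the events arrive in.

```python
from collections import Counter

# Hypothetical sample events as (hour of day, SessionId); not real data.
events = [(14, "12345"), (14, "67890"), (15, "24680"), (19, "12345")]

# Imitate dedup SessionId: keep only the first event seen per SessionId.
seen = set()
first_seen = []
for hour, session_id in events:
    if session_id not in seen:
        seen.add(session_id)
        first_seen.append((hour, session_id))

# Imitate count by hour on the deduplicated events, then total them.
count_by_hour = Counter(hour for hour, _ in first_seen)
dedup_total = sum(count_by_hour.values())

# Distinct SessionIds over the whole day, as span=1d would report.
daily_dc = len({session_id for _, session_id in events})

print(dict(count_by_hour))  # {14: 2, 15: 1}
print(dedup_total)          # 3
print(daily_dc)             # 3
```

After dedup each session exists in exactly one hour, so the hourly counts add up to the daily distinct count. With dc per hour, nothing removes the repeat of 12345 at 19:00, so it is counted again.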