Splunk Search

How to create a search for multiple earliest dates for 7-day output?

kpavan
Path Finder

Hi All,

I am looking for a query that has multiple earliest days.

index=something sourcetype=something earliest=-7d@d latest=@d
| timechart span=1d dc(id) as total

It gives this output:

2022-08-31 13548
2022-09-01 13438
2022-09-02 13782
2022-09-03 9831
2022-09-04 13602
2022-09-05 12856
2022-09-06 12849

 

But the actual count per day should be above 25k. Because the data is split by day, the per-day numbers in the table above are much lower.

If I use

index=something sourcetype=something earliest=-7d@d latest=@d
| stats dc(id) as total

output is 26894

index=something sourcetype=something earliest=-8d@d latest=-1d@d
| stats dc(id) as total

output 27099

And so on. If I shift earliest and latest to cover any 7-day window, I get more than 25k or 26k, but with timechart the numbers are about half that.

It would be a great help if anyone has a way to get the correct output in a single query.

Thanks in advance!


yuanliu
SplunkTrust

@kpavan wrote:

But actual data per day is something above 25k, but because of data is getting split so number showing very less per day wise as above table.

... so on, if I change earliest and latest to get last 7 days i get above 25k or 26k but if use timechart then its half the number.

I think you misunderstood what timechart span=1d does.  The problem does not exist.  Let me break it down a little.

First, if you perform 

index=something sourcetype=something earliest=-1d@d latest=-0d@d
| stats max(_time) as _time dc(id) as total

you'll get something like 

2022-09-06 12849

Then, perform

index=something sourcetype=something earliest=-2d@d latest=-1d@d
| stats max(_time) as _time dc(id) as total

output will look like

2022-09-05 12856

and so on.  The point is, this sequence is exactly what timechart span=1d does.  Timechart does not reduce counts by half; it simply performs the count one day at a time, for each day in the range.

Secondly, why does 

index=something sourcetype=something earliest=-7d@d latest=@d
| stats dc(id) as total

end up at only 26894 instead of the sum of the 7 days of timechart, i.e., ~90,000? That's because you are performing a distinct count (dc).  There is a large number of overlaps in field id from day to day. For example, if ids A, B, and C appear on day one, and A, C, and D appear on day two, your dc(id) will be 3 on each day individually; that's what timechart span=1d will show.  But if you set earliest=-2d and perform dc(id) across both days, the output will be 4, not 6.
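
The A/B/C/D example can be reproduced with synthetic events. This is only an illustrative sketch; the field names n, day, total_per_day, and total_both_days are made up for the example and are not part of the original data:

```spl
| makeresults count=6
| streamstats count as n
| eval day=if(n<=3, "day one", "day two")
| eval id=case(n=1,"A", n=2,"B", n=3,"C", n=4,"A", n=5,"C", n=6,"D")
| eventstats dc(id) as total_both_days
| stats dc(id) as total_per_day max(total_both_days) as total_both_days by day
```

Each day should show total_per_day=3, while total_both_days is 4 rather than 6, because A and C are counted only once across the two days.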

Do another experiment:

index=something sourcetype=something earliest=-7d@d latest=@d
| timechart span=7d dc(id) as total

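This should give you that magical number ~27,000. (If the 7-day bucket does not line up with the search window, the result may be split across two rows.)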

All this is a long way to say that timechart span=1d is really giving correct results (as far as dc is concerned).
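
If what the question is really after is a trailing 7-day distinct count for each day (the 26894, 27099, ... series), one possible approach is to collect each day's ids and then count distinct values across a sliding window of 7 daily rows. This is an untested sketch and not something proposed in the thread. The field names ids and total_7d are made up, and it assumes that streamstats dc() counts distinct values across the multivalue ids field within the window:

```spl
index=something sourcetype=something earliest=-13d@d latest=@d
| timechart span=1d values(id) as ids
| streamstats window=7 dc(ids) as total_7d
| where _time >= relative_time(now(), "-7d@d")
| fields _time total_7d
```

The search reaches back 13 days so that the oldest of the 7 displayed days still has a full 7-day window behind it. Collecting every id per day can be memory-heavy at this volume, so treat it as a starting point rather than a tuned solution.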

ITWhisperer
SplunkTrust

What were you expecting as the correct result? The values you have shown are plausible and not inconsistent with each other.


richgalloway
SplunkTrust

Please explain what is meant by "multiple earliest days".  A query can have only one earliest_time setting.

I think the distinct_count (dc) function may be confusing the matter.  It may be normal for a span of 7 or 8 days to have 26000+ unique values for a field, but for each day in that same range to have far fewer.  It merely means id values are repeated across the days.  If you use count instead of distinct_count, then you should see the totals for each day add up to the count for all 7 or 8 days.
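
A quick way to see the difference side by side, as a sketch (the field names events and distinct_ids are made up for illustration):

```spl
index=something sourcetype=something earliest=-7d@d latest=@d
| timechart span=1d count as events dc(id) as distinct_ids
| addcoltotals events distinct_ids
```

The extra row added by addcoltotals sums each column. The events total should match what stats count returns for the same 7-day window, whereas the distinct_ids total will be larger than stats dc(id) over that window whenever ids repeat from one day to the next.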

---
If this reply helps you, Karma would be appreciated.