Why is some data not displayed when the search time range is greater than 30 days?

cmerriman
Super Champion

When I run a timechart over the last 30 days of data for one event, a couple of days come back with 0 results. When I narrow the time range to those days (and the surrounding days) to see what might be going on, everything looks completely normal: I get about the same number of events and the same averages/sums in my timechart as on the surrounding days. I'm curious why a larger time range would cause some data not to display. It isn't a complex search. There are no joins, just one sourcetype with a few qualifiers and then the timechart. I've also tried a stats by date, but that didn't work either.

javiergn
Super Champion

Can you run the following and see if you have events every day?

sourcetype=details price>0 service=1203 status = 1
| fields _time, site, price
| bucket _time span=1d
| stats count, avg(price) by site, _time

In case that doesn't solve your problem, if you could post the output and what you were actually expecting it would definitely help.

cmerriman
Super Champion

When I adjust my time frame and only look at the surrounding days (the day before, the 2 missing days, and day after), everything is there. It is only when I look at the larger time frame that the data from the 2 days is dropped.

Expected:

date    dc(site)    avg(price)  count
2/8/2016    2533    2523.389272 16965
2/9/2016    2545    2823.037768 16575
2/10/2016   2612    2376.104439 16852
2/11/2016   2553    2349.573458 17037

Actual:

date    dc(site)    avg(price)  count
2/8/2016    2533    2523.389272 16965
2/9/2016    0       0
2/10/2016   0       0
2/11/2016   2553    2349.573458 17037

javiergn
Super Champion

That's very strange. Did you run the query I posted above and check the numbers for those missing days over the larger time frame?

If I understand correctly, the following works:

sourcetype=details price>0 service=1203 status = 1 earliest="02/08/2016:00:00:00" latest="02/12/2016:00:00:00"
| dedup site record
| timechart span=1d dc(site) avg(price) count

But the following doesn't, correct?

sourcetype=details price>0 service=1203 status = 1 earliest=-31d
| dedup site record
| timechart span=1d dc(site) avg(price) count

Can you try again without the dedup? Or, if you want to keep the dedup, bucket by time first and include _time in the dedup fields:

sourcetype=details price>0 service=1203 status = 1 earliest=-31d
| bucket _time span=1d
| dedup site record _time
| stats dc(site) avg(price) count by _time
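
To see why adding the bucketed _time to the dedup fields matters, here is a rough Python sketch of the idea. It is only a toy model, not Splunk itself: `bucket_day` and `dedup` are hypothetical helpers standing in for `bucket _time span=1d` and `dedup`, and the three events are made up and listed newest first.

```python
from datetime import datetime


def bucket_day(ts):
    # Rough stand-in for `bucket _time span=1d`: snap a timestamp to midnight.
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def dedup(rows, fields):
    # Toy model of `dedup`: keep the first row seen for each combination of fields.
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[f] for f in fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Made-up events for one site/record pair, newest first.
events = [
    {"_time": datetime(2016, 2, 10, 18, 0), "site": "siteA", "record": "r1", "price": 30.0},
    {"_time": datetime(2016, 2, 10, 9, 0), "site": "siteA", "record": "r1", "price": 20.0},
    {"_time": datetime(2016, 2, 9, 12, 0), "site": "siteA", "record": "r1", "price": 10.0},
]

bucketed = [dict(e, _time=bucket_day(e["_time"])) for e in events]

# dedup on site and record only: one event survives for the whole range.
no_time = dedup(bucketed, ["site", "record"])
# dedup on site, record and the daily bucket: one event survives per day.
with_time = dedup(bucketed, ["site", "record", "_time"])

days_no_time = sorted({e["_time"].date().isoformat() for e in no_time})
days_with_time = sorted({e["_time"].date().isoformat() for e in with_time})

print(days_no_time)    # only the newest day is left
print(days_with_time)  # both days are left
```

In this toy data, the dedup without _time leaves nothing on the older day, while the dedup that includes the daily bucket keeps the newest event of each day.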

cmerriman
Super Champion

You're correct. Whichever way I've run it, I lose those days unless I focus on a smaller time frame. When I run just the sourcetype with nothing else for the entire 30 days, though, all days/events are there. Could I be running into a memory problem or something?

cmerriman
Super Champion

The following seems to work, for whatever reason. I took out the dedup and used a stats command instead, removed all the qualifiers from the base search, and applied them after the initial stats command. The results aren't exactly the same, but they're pretty close. I suspect that's because dedup works a little differently than my values command. I need to keep the most recent event by site and record (which is why I originally used dedup site record) and then keep only the events with a status of 1.

sourcetype=details
| eval date=strftime(_time,"%D")
| stats values(status) as status by site record price service date
| search status=1 price>0 service=1203
| stats dc(site) avg(price) count by date
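
One possible reason the two approaches give slightly different numbers can be sketched in Python. This is a simplified toy model resting on two assumptions: that `dedup` keeps the first (newest) event per site and record, and that `search status=1` on the multivalue result of `values(status)` matches when any collected value is 1. The events and both function names are made up, and the grouping ignores price and service for brevity.

```python
from collections import defaultdict

# Made-up events, newest first. siteA/r1 changed from status 1 to status 0 during the day.
events = [
    {"site": "siteA", "record": "r1", "date": "02/09/16", "status": 0},
    {"site": "siteA", "record": "r1", "date": "02/09/16", "status": 1},
    {"site": "siteB", "record": "r2", "date": "02/09/16", "status": 1},
]


def latest_then_filter(rows):
    # Toy model of: dedup site record | search status=1
    seen = set()
    kept = []
    for row in rows:
        key = (row["site"], row["record"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(row)
    return [row for row in kept if row["status"] == 1]


def values_then_filter(rows):
    # Toy model of: stats values(status) as status by site record date | search status=1
    groups = defaultdict(set)
    for row in rows:
        groups[(row["site"], row["record"], row["date"])].add(row["status"])
    # A group matches if ANY of its collected status values is 1.
    return [key for key, statuses in groups.items() if 1 in statuses]


dedup_sites = sorted(row["site"] for row in latest_then_filter(events))
values_sites = sorted(key[0] for key in values_then_filter(events))

print(dedup_sites)   # siteA is excluded: its newest status is 0
print(values_sites)  # siteA is included: it had status 1 at some point that day
```

Under those assumptions, the values-based version counts a site/record whose status was 1 at any point in the day, whereas the dedup version only counts it if the most recent status is 1.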

javiergn
Super Champion

Because you run dedup very early and leave the time field out of it, you drop every event that shares a site and record with an event already seen, even when it occurred at a different time.

For example:

date, site, record
January, siteA, recordA
February, siteA, recordA
February, siteB, recordB
February, siteC, recordC
March, siteA, recordA

When you dedup based on site and record only you will get:

date, site, record
January, siteA, recordA
February, siteB, recordB
February, siteC, recordC

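That is the first event encountered, in result order, for each unique combination of site and record; every later event with the same pair is dropped regardless of its date.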
Does that make sense?
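
The same example can be reproduced with a small Python sketch. The `dedup` function here is a hypothetical toy model (keep the first row seen per key, in input order), not Splunk's implementation; the rows are the ones from the example above.

```python
def dedup(rows, fields):
    # Toy model of `dedup`: keep the first row seen for each combination of fields.
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[f] for f in fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


rows = [
    {"date": "January", "site": "siteA", "record": "recordA"},
    {"date": "February", "site": "siteA", "record": "recordA"},
    {"date": "February", "site": "siteB", "record": "recordB"},
    {"date": "February", "site": "siteC", "record": "recordC"},
    {"date": "March", "site": "siteA", "record": "recordA"},
]

# Keyed on site and record only: later siteA/recordA rows are dropped.
by_site_record = dedup(rows, ["site", "record"])
# Keyed on site, record and date: each date keeps its own row.
by_site_record_date = dedup(rows, ["site", "record", "date"])

for row in by_site_record:
    print(row["date"], row["site"], row["record"])
print(len(by_site_record_date), "rows kept when date is part of the key")
```

With date left out of the key, March disappears entirely and February loses its siteA row; with date included, all five rows survive.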

javiergn
Super Champion

Hi, can you post your query or an obfuscated version of your query here?

cmerriman
Super Champion

sourcetype=details price>0 service=1203
| dedup site record
| search status=1
| timechart span=1d dc(site) avg(price) count

I've attempted to simplify it without the dedup and search, but that didn't work either.
