Hi,
I am facing a weird situation. I am running a query that returns the last day of data, bucketed by hour.
base search
| search type="data"
| timechart span=1h count(eval(HTTP_Status="202")) as today_count
The first time I ran this, I got a count of httpd requests for each hour.
When I ran the same search a second time, the count for every hour differed by around 100-200.
Why is this happening?
Any help?
Thank you!
Try something like this
base search
| search type="data"
| bin _time span=1h
| where HTTP_Status="202"
| stats count as today_count by _time
I suspect that timechart is moving the boundaries of the span, although that should not happen with earliest=-24h@h.
The above method should give you exactly the same number of records for an hour each time it is run across the same hour. If it does not, then you have (A) something going on with your data ingestion, or (B) something going on with records getting moved out of warm to cold or (C) indexers going offline for some reason.
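One rough way to look into case (C) is to split the same hourly count by the default splunk_server field, which records which indexer returned each event. This is only a sketch: "base search" stands for your original search, exactly as above, and I have not run it against your data.

```spl
base search
| search type="data" HTTP_Status="202"
| bin _time span=1h
| stats count as today_count by _time, splunk_server
```

If one server's rows are missing or noticeably lower on a later run over the same hours, that points toward an indexer-side problem rather than timechart.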
I am still getting a different event count on every run, and I don't know the exact problem. This is the first time I have observed this kind of change. Our data is a real-time log generated by a mobile application.
Maybe different time zones are causing this issue. Thanks.
Look at the difference between _time and _indextime. If you are east of UTC, that does seem like the most likely scenario, especially if you are in UTC+2 or so and the data count is only changing in the last two hours.
See the following answer for information on seeing the difference between the timestamp and the index time:
https://answers.splunk.com/answers/540344/how-to-compute-indextime-time-difference-average-w.html
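As a minimal sketch of that comparison (again with "base search" standing for your own search, and the lag_seconds field name being my own choice), you could compute the indexing lag per event and summarize it per hour:

```spl
base search
| search type="data" HTTP_Status="202"
| eval lag_seconds = _indextime - _time
| bin _time span=1h
| stats count as today_count, avg(lag_seconds) as avg_lag_seconds, max(lag_seconds) as max_lag_seconds by _time
```

A large or negative lag in some hours would suggest that events are landing in a different hour bucket than you expect, which is what a time zone problem would look like.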
Thanks for your reply. Let me try it out; I will get back to you with the result.
@prathapkcsc, are you using post-processing? Also, is there a reason for type="data" to be outside the base search? If you are only showing HTTP_Status=202, can you also add HTTP_Status="202" to your base search?
The following will give you today's 202 events with span=1h (assuming that by "last one day" you mean today, since you have also named the count today_count).
<yourBaseSearch> type="data" HTTP_Status=202 earliest=-0d@d latest=now
| timechart span=1h count as today_count
"data" is just a pattern matched against the log data, like a grep.
sourcetype="mylogdata" earliest=-5h@h latest=now HTTP_Status=202
| search type="data"
| timechart span=1h count
That is the query I am running. Somehow the values change on every search; I see a difference of at least 10-20 in the counts.
I have no idea about post-processing.
@prathapkcsc, if you are not using post-processing, there is nothing to investigate there. Can you try the following search and see the result counts?
sourcetype="mylogdata" earliest=-6h@h latest=-1h@h HTTP_Status=202 type="data"
| timechart span=1h count
If the counts still do not match, can you verify whether the time in the raw data matches the time picked up in the _time field?
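One way to eyeball this, as a sketch only: print _time in a readable form next to the raw event for a handful of events. The parsed_time field name is my own, and date_zone is a default field that is only populated when the timestamp was extracted from the raw text, so it may be empty for your data.

```spl
sourcetype="mylogdata" earliest=-6h@h latest=-1h@h HTTP_Status=202 type="data"
| head 20
| eval parsed_time = strftime(_time, "%Y-%m-%d %H:%M:%S %z")
| table parsed_time, date_zone, _raw
```

If the timestamp visible in _raw and parsed_time differ by more than the time zone offset you expect, the timestamp extraction is worth a closer look.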
I used the same command. I can still see a difference on every search.
The raw data is in the EDT time zone, and the _time field is shown in UTC.
Is this causing the issue?
Try the following
sourcetype="mylogdata" earliest=-1d@h-5 latest=-1d@h-3 HTTP_Status=202 type="data"
| timechart span=1h count
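To see what window modifiers like these resolve to, here is a small sketch. In -1d@h, the -1d part goes back one day and @h snaps down to the start of that hour; the trailing offset is then applied on top. The offsets in the query above are written without a unit, so this sketch assumes hours were meant (-5h and -3h). That is an assumption on my part, and the window_start and window_end names are just illustrative.

```spl
| makeresults
| eval window_start = relative_time(relative_time(now(), "-1d@h"), "-5h")
| eval window_end = relative_time(relative_time(now(), "-1d@h"), "-3h")
| eval window_start = strftime(window_start, "%Y-%m-%d %H:%M:%S"), window_end = strftime(window_end, "%Y-%m-%d %H:%M:%S")
| table window_start, window_end
```

Under that assumption the window is two full hours from yesterday that are long finished, so the counts for it should not move between runs.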
Also check (in verbose mode) whether your raw data has date_hour field extracted.
I think the counts might change for the following reasons:
1) The latest hour bucket changes because data is still flowing in.
2) Your _time is not picking up the event time correctly.
That is why I wanted you to test a time range that has already passed and whose events have already been indexed.
If date_hour is present you can also try stats count by date_hour to see whether the counts add up or not.
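A possible variation on that, sketched under the assumption that date_hour is extracted for your sourcetype: count by date_hour and by the hour derived from _time side by side. date_hour comes straight from the timestamp text in the raw event and is not adjusted for time zone, while time_hour (a name I made up) is derived from _time.

```spl
sourcetype="mylogdata" earliest=-6h@h latest=-1h@h HTTP_Status=202 type="data"
| eval time_hour = strftime(_time, "%H")
| stats count by date_hour, time_hour
```

If each date_hour maps to exactly one time_hour with a constant offset, the timestamps are being parsed consistently and the offset is just the time zone difference.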
Hi, I used the query below:
sourcetype="mylogdata" type="data" earliest=-1d@h-5 latest=-1d@h-3 HTTP_Status="202"
|timechart span=1h count
I got the output below:
_time count
2018-06-11 07:00 34
Just FYI, I have extracted fields. What should I take away from the above query?
I am observing the event count change for almost all hours.
Also, please let me know the meaning of the line below:
earliest=-1d@h-5 latest=-1d@h-3
Are you seeing any yellow triangles on the results page? Have you used the job inspector to see if there are any problems? Are you using the same time period, like yesterday, or are you using something like last 24 hours?
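To rule out a moving window, a sketch like the following pins the search to all of yesterday by snapping both ends to midnight, so two runs on the same day cover exactly the same 24 hours. The sourcetype and filters are copied from the searches earlier in this thread; yesterday_count is just an illustrative name.

```spl
sourcetype="mylogdata" type="data" HTTP_Status=202 earliest=-1d@d latest=@d
| timechart span=1h count as yesterday_count
```

If the hourly counts still differ between runs over this fixed window, the time range is not the cause.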
No, I am not seeing any yellow triangles. I am using the same command with earliest=-24h@h latest=now.
at 1st run :
19th hour --> 52168
20th hour ---> 121115
at 2nd run :
19th hour : 52153
20th hour : 121082
As shown above, the event count decreases with every search.