I have scheduled searches that create statistics based on individual files. These searches run once per hour.
i.e. a log file comes in, and stats are generated about the events inside that file.
My problem is that events in a file can range from a few hours old to DAYS old (the oldest one I found was 45 days old, but around a dozen days is more typical).
So the main problem is that events are being inserted into the past. I had previously looked at the Answers posts regarding late events and none of them helped me.
I have found a way around this issue by generating statistics for a batch file only once I see a specific "HDR" (header) text marker. By looking only for specific source files, the search runs very fast and only touches applicable events. Also, by manually specifying a large search window (to override the 1 hour time window used by the saved search), it is able to capture all the events that were originally contained inside that source file.
i.e.
index=ivr [search index="ivr" "HDR," earliest=-1h latest=now | fields source ] earliest=-365d latest=+365d | dedup _raw | REX field=source "(? ain\S+[a-zA-Z0-9]+$)" | bucket _time span=1h | stats first(date_mday) AS FirstDay, last(date_mday) AS LastDay, first(date_month) AS Month, first(date_year) AS Year, last(date_hour) AS Hour, last(date_minute) AS Minute, first(_time) AS File_Time, some other stats calcs here by Type Batch
So in a nutshell: "if you see a header marker, find ALL the events from ALL (or the equivalent of) time for that particular source file and calculate statistics on them."
Now this is fine and actually captures the data "per file", but it has the side effect of breaking how summary index results are searched for.
By hardcoding the earliest and latest parameters, the summary index is now saving its stash entries with dates from 1 year ago. So if I search for summary results over the last week, I get no results, but if I search 1 year back, there they are. This is because "info_min_time" is used by default as the "_time" value of the summary entries. As this is metadata, I don't think there is any way that I can change it for already created summary index results. So it's not possible for me to do something like "| rename info_min_time as _time".
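To illustrate what I am seeing, a search along the lines of the sketch below lays the timestamps side by side. The index name `summary` and the `search_name` value are placeholders rather than my real names, and I am assuming the stash entries carry the usual `info_min_time` and `info_search_time` fields:

```
index=summary search_name="my_saved_search_name" earliest=-2y latest=now
| eval stamped_time=strftime(_time, "%Y-%m-%d %H:%M:%S")
| eval window_start=strftime(info_min_time, "%Y-%m-%d %H:%M:%S")
| eval search_ran=strftime(info_search_time, "%Y-%m-%d %H:%M:%S")
| table stamped_time window_start search_ran Type Batch
```

If my understanding is right, `stamped_time` should line up with `window_start` (about a year back) rather than with `search_ran`, which is why a search over the last week comes back empty.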
Does anyone know how I can trick a search of the summary results into using another time field, OR how to modify a scheduled search so that it searches outside of its scheduled time window without breaking the summary index results as I have done?