Hello,
I have a scheduled saved search that populates a summary index with ~50M events. When the search is triggered, I monitor its progress in the Job Inspector, and I noticed that it reaches 100% in about 30 minutes. After that point, the search reports no further progress. I checked search.log,
and the last update is "StatsProcessor - flushed stats...."
into a gzipped file in the scheduler's job directory. I also checked that directory, and in the file status.csv
the job is reported as "FINALIZING". After waiting a couple of hours, there is still no progress.
My concern is that the flushed results report a count of ~9M events, which is correct based on the indexed events. However, I use timechart
and then untable
to fill empty buckets, and that is what expands the result set to 50M events.
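The relevant part of my search follows this pattern (index, span, and field names here are illustrative, not my actual search):

```
index=my_index sourcetype=my_sourcetype
| timechart span=1m count by host
| untable _time host count
```

timechart emits one row per time bucket with one column per host, zero-filling cells for empty buckets; untable then turns every (bucket, host) cell back into a row. With many hosts and a small span, ~9M raw events can easily fan out to ~50M result rows.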
Also, the scheduler's log in the _internal index
does not report any errors at all.
Is there another log or process I could check to gather more details, or any other ideas? My limits.conf is already configured to handle searches this large.
thanks, Dimoklis
Ok... can you provide your search? Also, consider it running multiple times a day but processing different hours of the day to reduce the number of rows processed per run.
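As a sketch, splitting the daily run into four six-hour windows could look like the following savedsearches.conf stanza (the stanza name, schedule, and search are placeholders, not a recommendation of exact values):

```
[populate_my_summary]
# Run at 00:05, 06:05, 12:05, and 18:05
cron_schedule = 5 0,6,12,18 * * *
# Each run covers the previous six whole hours
dispatch.earliest_time = -6h@h
dispatch.latest_time = @h
enableSched = 1
action.summary_index = 1
action.summary_index._name = my_summary
search = index=my_index sourcetype=my_sourcetype | sistats count by host
```

Each run then writes a quarter of the day's rows, which keeps the finalizing/write phase much shorter per job.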
I worked around it by changing the way I aggregate the results, using stats first. Thanks @somesoni2. Marked as the answer since it describes a general
best practice.
Converting this to an answer because running the search multiple times a day will likely fix the problem. I've found that when you have a search that populates a summary index, it doesn't actually write any data to the summary index until the "finalizing" stage. So, Splunk taking a long time to write 50M results is not surprising to me.
I have to wonder what the use case is for doing this though, as there may be a better way to implement @dimoklis's desired outcome. I usually use summary indexing to aggregate data, although I have seen it misused as a lazy way to filter the data within indexes into new indexes (instead of using event routing via props/transforms).
How frequently do you run your summary index search? 50M is a lot of events, and it would be great if you could increase the frequency to reduce the number of events written per run.
Hi @somesoni2, thanks for the reply. I run it once a day, during a dead quiet period. I think the bottleneck is in timechart/untable.
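In case it helps others, my workaround aggregates with stats before any bucket filling, roughly like this (field and index names are again illustrative):

```
index=my_index sourcetype=my_sourcetype
| bin _time span=1m
| stats count by _time, host
```

This writes only the non-empty (bucket, host) combinations to the summary index. Empty buckets can then be zero-filled at report time when searching the summary (e.g. with timechart over the summarized data), instead of materializing tens of millions of zero rows during the populating search.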