I can't figure out how to configure summary indexing for my situation. Here is my setup:
I do not index live logs. Every day at 6am, I index compressed data that is one day old. For instance, on January 11th at 6am, I index data from January 9th 6am to January 10th 6am. This process cannot be changed, for many reasons.
I have a large amount of data, so searches over a long period (typically one month) take a long time to run. So I use summary indexing to speed up the searches and the resulting dashboards.
My problem is that when I configure a summary indexing search to run at midnight over the previous 24 hours, there are no logs in that window. If I set it to 48 hours, it processes only part of the logs. If I set it to 72 hours, it processes the new logs added during the morning (good), plus some logs that have already been summarized.
Is this a problem? Can the process figure out that the indexed data has already been summarized, or will it summarize it again and make my results wrong?
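To make the window arithmetic above concrete, here is a small Python sketch of it. The year (2024) and the run time (midnight following the January 11th load) are assumptions, and the helper names are made up for illustration. It only models event-time ranges and does not touch Splunk.

```python
from datetime import datetime, timedelta

def overlap_hours(a_start, a_end, b_start, b_end):
    """Length in hours of the overlap between [a_start, a_end) and [b_start, b_end)."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return max((end - start).total_seconds(), 0) / 3600

# Assumed dates: the post gives no year, 2024 is arbitrary.
run_time = datetime(2024, 1, 12, 0, 0)    # midnight after the Jan 11th 6am load
new_start = datetime(2024, 1, 9, 6, 0)    # start of the batch loaded on Jan 11th
new_end = datetime(2024, 1, 10, 6, 0)     # end of that batch
old_start = datetime(2024, 1, 1, 0, 0)    # assumed start of previously loaded data

def coverage(window_hours):
    """Return (hours of new data, hours of older data) inside the search window."""
    window_start = run_time - timedelta(hours=window_hours)
    new = overlap_hours(window_start, run_time, new_start, new_end)
    old = overlap_hours(window_start, run_time, old_start, new_start)
    return new, old

for hours in (24, 48, 72):
    new, old = coverage(hours)
    print(f"last {hours}h: {new:.0f}h of new data, {old:.0f}h of older data")
```

Under these assumptions, the 24-hour window contains no events, the 48-hour window catches only 6 hours of the new batch, and the 72-hour window catches the whole batch plus 6 hours that belong to the previous day's load.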
You could use the backfill script (fill_summary_index.py) to fill in the missing summary data. It works out the time slices for which summary data already exists and only generates the missing ones.
You would have to schedule this outside Splunk (for example with cron), but it would work.
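As a rough sketch of what the externally scheduled job might run, the Python helper below only assembles the command line for the backfill script. The helper name, install path, app, saved search name, time range, and credentials are all placeholders. The flags (-app, -name, -et, -lt, -dedup, -auth) are the ones I recall from the Splunk documentation for this script, with -dedup true being the switch that skips time slices already present in the summary index; please check them against your Splunk version before relying on this.

```python
import os
import shlex

def build_backfill_command(splunk_home, app, search_name, earliest, latest, auth):
    """Assemble (but do not run) a fill_summary_index.py invocation.

    All argument values are supplied by the caller; nothing here is
    specific to a real installation.
    """
    splunk = os.path.join(splunk_home, "bin", "splunk")
    script = os.path.join(splunk_home, "bin", "fill_summary_index.py")
    return [
        splunk, "cmd", "python", script,
        "-app", app,
        "-name", search_name,
        "-et", earliest,      # earliest time of the range to backfill
        "-lt", latest,        # latest time of the range to backfill
        "-dedup", "true",     # skip slices that already have summary data
        "-auth", auth,
    ]

if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = build_backfill_command(
        "/opt/splunk", "search", "my_summary_search",
        "-72h@h", "now", "admin:changeme",
    )
    print(shlex.join(cmd))
```

The printed line could be placed in a cron entry that fires some time after the 6am load, or the list could be passed to subprocess.run from a wrapper script.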