I created a search that runs over the last 5 minutes and scheduled it to run every 5 minutes to update a summary index. I am finding duplicates in my data. Is there a better way to manage the run times to make sure there are no duplicates? I am only writing a table into the summary; there are no stats commands. Is there some way to handle this better?
Are your duplicates only in the last seconds of the five-minute window, or spread across the whole period?
If the duplicates are only in the last period, you could add to your main search a subsearch against the summary index that excludes events already indexed; in other words, something like this:
index=my_index NOT [ search index=my_summary_index | fields _time field1 field2 field3 ] | table _time field1 field2 field3 | collect index=my_summary_index
If instead they span a larger period, you should re-analyze your data and your search.
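One other common cause of duplicates is overlap between the scheduled run's time window and the previous run's window (e.g. the search firing a few seconds late while still using a relative "last 5 minutes"). A sketch of how you might pin the window to snapped minute boundaries so consecutive runs never overlap (index and field names here are placeholders matching the example above):

index=my_index earliest=-5m@m latest=@m | table _time field1 field2 field3 | collect index=my_summary_index

With `earliest=-5m@m latest=@m`, each run covers exactly one 5-minute block snapped to the minute, so a run scheduled every 5 minutes picks up where the previous one ended regardless of small scheduling delays.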