
Can a search that populates a summary index add a field to the raw data?

Engager

Hello,
I am populating a summary index with a search:
index=index1 | addinfo | collect index=summary

I want to schedule the above search to run multiple times a day, but due to the nature of the data, this will introduce duplicate events into the summary index. Is there a way for the populating search to add a field to index1, called isProcessed="true", so that the populating search can filter events by isnull(isProcessed) and duplicate events won't be added to the summary index?


Re: Can a search that populates a summary index add a field to the raw data?

SplunkTrust

Once data is indexed, it can't be changed, so the answer is no. What you can do is modify your summary index search so that it excludes events from index1 that are already in the summary index.
For example, if the index=index1 events have a unique primary key field, your search would look like this:

index=index1 NOT [search index=summary | stats count by primaryKeyField ] | addinfo | collect index=summary

Also, I would look further into why there are duplicates. Do you have overlapping time ranges in your summary index search?
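
If overlapping time ranges do turn out to be the cause, one possible fix is to snap each run to a fixed, non-overlapping window. The following is only a sketch, assuming the search is scheduled once per hour (the hourly schedule is an assumption, not something stated in the question) and that events reach index1 before their hour closes:

```
index=index1 earliest=-1h@h latest=@h
| addinfo
| collect index=summary
```

Here earliest=-1h@h and latest=@h restrict each run to the previous full hour, so consecutive runs should not pick up the same events twice. Events that are indexed late, after their window has already been summarized, would be missed by this approach.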


Re: Can a search that populates a summary index add a field to the raw data?

Engager

@somesoni2
That's awesome. I wasn't successful in excluding events with the stats command. I changed to the table command and verified the search works. Thanks!
index=index1 NOT [search index=summary | table primaryKeyField ] | addinfo | collect index=summary
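
A likely reason the stats version did not exclude anything: a subsearch returns every field in its results as a filter condition, so "stats count by primaryKeyField" produces conditions on both primaryKeyField and count. Because the index1 events have no matching count field, the NOT clause never matches. A minimal sketch that keeps stats (which also removes duplicate key values) but returns only the key field, using the same placeholder field name primaryKeyField as above:

```
index=index1 NOT [search index=summary | stats count by primaryKeyField | fields primaryKeyField]
| addinfo
| collect index=summary
```

Subsearches are subject to result and runtime limits, so with a large summary index it is worth restricting the time range of the subsearch as well.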
