Knowledge Management

Can a search that populates a summary index add a field to the raw data?

vpao
Engager

Hello,
I am populating a summary index with a search:
index=index1 | addinfo | collect index=summary

I want to schedule this search to run several times a day, but because of the nature of the data, that will introduce duplicate events into the summary index. Is there a way for the populating search to add a field to the events in index1, for example isProcessed="true", so that it can then filter on isnull(isProcessed) and avoid adding duplicate events to the summary index?

1 Solution

somesoni2
Revered Legend

Data, once indexed, can't be changed, so the answer is no. What you can do is modify the search that populates the summary index so that it excludes events from index1 that are already in the summary index.
For example, if the events in index=index1 have a unique primary key field, your search would look like this:

index=index1 NOT [search index=summary | stats count by primaryKeyField ] | addinfo | collect index=summary

Also, I would look further into why there are duplicates. Does your summary index search run over overlapping time ranges?
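If overlapping time ranges turn out to be the cause, one common way to avoid them is to snap each scheduled run to a fixed, non-overlapping window. The following is a minimal sketch that assumes the search is scheduled once per hour; the one-hour window is an assumption for illustration and should match whatever schedule is actually used:

```spl
index=index1 earliest=-1h@h latest=@h
| addinfo
| collect index=summary
```

Here `earliest=-1h@h latest=@h` covers exactly the previous whole hour, so consecutive hourly runs do not read the same events twice. This only helps with duplicates caused by overlapping windows; it does not deduplicate events that are themselves repeated in index1.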

vpao
Engager

@somesoni2
That's awesome. I wasn't successful in excluding events with the stats command. I changed to the table command and verified the search works. Thanks!
index=index1 NOT [search index=summary | table primaryKeyField ] | addinfo | collect index=summary
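A likely reason the stats version excluded nothing: a subsearch returns every field in its results as a search condition, so `stats count by primaryKeyField` produces terms of the form `primaryKeyField=X AND count=N`. Events in index1 presumably have no `count` field, so no event matches and `NOT` filters out nothing. The table version works because it returns only `primaryKeyField`. A sketch of an equivalent that keeps stats for deduplicating the keys but drops the extra field (same field and index names as above):

```spl
index=index1 NOT [search index=summary
    | stats count by primaryKeyField
    | fields primaryKeyField ]
| addinfo
| collect index=summary
```

Keep in mind that subsearches are subject to result-count and runtime limits (by default, typically 10,000 results and 60 seconds), so this pattern may silently miss keys once the summary index grows large. Restricting the subsearch to a bounded time range is one way to stay under those limits.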
