Getting Data In

Is it possible to fix a scripted input once it's been indexed?


I'm writing a Splunk App and looking for a few pointers on how to approach the following:

  • A scripted input requests events from a rest api.
  • Sometimes, but not often, an event needs to be corrected after it's been indexed.
  • Is this possible?
  • What I was thinking is that my input script could run a search against the API and then delete the old event and index the updated event.
  • This brings up the question of whether or not the input script would have the correct permissions to search / delete events via the rest api.
  • The events are stored in an index specific to this application.
  • The only other option would be to index the updated event as well and use the Splunk search language to filter out old events, looking only at the one with the most recent indexed-on date.
1 Solution

It is technically possible to do what you are asking. You would create a service account with permission to run Splunk searches that use the delete command (by default this requires the can_delete role). Note that delete does not remove data from the underlying storage; it only prevents the events from being returned in searches. Here's some documentation on this:
https://docs.splunk.com/Documentation/Splunk/7.0.0/Indexer/RemovedatafromSplunk

And then it is possible to feed Splunk new events with the same timestamps as the prior events but with revised data.

However, this is a terrible idea. Allowing a service account to delete data at will is asking for trouble. Even though you can, you definitely should not.

By far, your best bet is to go with your final suggestion - to log all the events/data and use SPL to find the correct data. This means you should give some good thought now, while you are architecting your script and processes, to how you will correlate revised logs and clearly identify them.
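As a rough illustration of what "correlate revised logs and clearly identify them" could look like, here is a minimal sketch of a scripted input that stamps every event with a stable identifier and a revision counter. The field names record_id and revision, the sample records, and the shape of the API response are all assumptions made up for this example; the only Splunk behavior relied on is that a scripted input indexes whatever the script writes to stdout.

```python
import json
import sys
import time


def build_event(record_id, revision, payload, now=None):
    """Serialize one API record as a single-line JSON event.

    record_id and revision are hypothetical field names chosen for this
    sketch; they let a later search group all versions of one record.
    """
    event = dict(payload)  # copy so the caller's dict is not modified
    event["time"] = now if now is not None else time.time()
    event["record_id"] = record_id
    event["revision"] = revision
    return json.dumps(event, sort_keys=True)


def emit(records, out=sys.stdout):
    """Write one JSON event per line to the given stream (stdout by default)."""
    for rec in records:
        out.write(build_event(rec["id"], rec["rev"], rec["data"]) + "\n")
    out.flush()


if __name__ == "__main__":
    # Stand-in for the REST API response; a real script would fetch this.
    sample = [
        {"id": "a1", "rev": 1, "data": {"status": "open"}},
        {"id": "a1", "rev": 2, "data": {"status": "closed"}},
    ]
    emit(sample)
```

With both versions indexed, a search can pick the highest revision (or the most recently indexed event) per record_id instead of deleting anything.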


SplunkTrust

Once events have been indexed, they cannot be changed in any way. Your only option is as you've already surmised: index the updated event and use SPL to filter out old events, keeping only the one with the most recent index time (available in the _indextime field).
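The "latest wins" filtering can be sketched in plain Python to make the logic concrete. This is only an illustration of the selection rule, not how Splunk implements it; record_id is a hypothetical correlation field the input script would have to add, and index_time stands in for the event's index time. In SPL the rough equivalent would be sorting by _indextime descending and then running dedup on the correlation field.

```python
def latest_per_record(events):
    """Keep, for each record_id, only the event with the greatest index_time.

    Each event is a dict with hypothetical keys "record_id" and "index_time".
    """
    latest = {}
    for ev in events:
        current = latest.get(ev["record_id"])
        if current is None or ev["index_time"] > current["index_time"]:
            latest[ev["record_id"]] = ev
    return list(latest.values())


if __name__ == "__main__":
    events = [
        {"record_id": "a1", "index_time": 100, "status": "open"},
        {"record_id": "b7", "index_time": 105, "status": "open"},
        {"record_id": "a1", "index_time": 250, "status": "closed"},
    ]
    for ev in latest_per_record(events):
        print(ev["record_id"], ev["status"])
```

The superseded version of a1 is still in the index; it is simply filtered out at search time.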

---
If this reply helps you, an upvote would be appreciated.