Getting Data In

Is it possible to fix a scripted input once it's been indexed?

samian
Engager

I'm writing a Splunk App and looking for a few pointers on how to approach the following:

  • A scripted input requests events from a REST API.
  • Occasionally, an event needs to be corrected after it has been indexed.
  • Is this possible?
  • My idea is that the input script could search Splunk through its REST API, delete the old event, and then index the updated event.
  • This raises the question of whether the input script would have the permissions needed to search for and delete events via the REST API.
  • The events are stored in an index specific to this application.
  • The only other option I can see is to index the updated event as well, then use the Splunk search language (SPL) to filter out old events and keep only the ones with the most recent indexed-on date.
1 Solution

elliotproebstel
Champion

It is technically possible to do what you are asking. You could create a service account with permission to run Splunk searches that use the delete command. Note that delete does not remove data from the underlying storage; it only prevents the events from being returned in searches. Here's some documentation on this:
https://docs.splunk.com/Documentation/Splunk/7.0.0/Indexer/RemovedatafromSplunk

And then it is possible to feed Splunk new events with the same timestamps as the prior events but with revised data.

However, this is a terrible idea. Allowing a service account to delete data at will is asking for trouble. Even though you can, you definitely should not.

By far, your best bet is to go with your final suggestion - to log all the events/data and use SPL to find the correct data. This means you should give some good thought now, while you are architecting your script and processes, to how you will correlate revised logs and clearly identify them.
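As a rough illustration of what "clearly identify" could look like on the input side, here is a minimal sketch of a scripted input that tags every event it prints with a stable ID and a revision counter. Everything in it is an assumption made for illustration: the field names `event_id`, `revision`, and `emitted_at`, the idea that each record from your REST API carries a stable `id`, and the in-memory `seen_revisions` dict (a real input would need to persist that state between runs).

```python
import json
import sys
import time


def format_event(record, seen_revisions):
    """Return one JSON line for a record, tagged with an ID and a revision.

    `record` is a dict from the (hypothetical) REST API and is assumed to
    carry a stable `id`. `seen_revisions` maps id -> last revision emitted.
    """
    event_id = record["id"]
    # First sighting gets revision 1; every correction bumps the counter.
    revision = seen_revisions.get(event_id, 0) + 1
    seen_revisions[event_id] = revision
    event = {
        "event_id": event_id,
        "revision": revision,
        "emitted_at": int(time.time()),
        "payload": record,
    }
    return json.dumps(event, sort_keys=True)


def emit(records, seen_revisions, out=sys.stdout):
    """Write one JSON event per line to stdout, which Splunk reads as input."""
    for record in records:
        out.write(format_event(record, seen_revisions) + "\n")
    out.flush()
```

With fields like these in every event, a corrected record is just another event with the same `event_id` and a higher `revision`, so nothing ever has to be deleted.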

richgalloway
SplunkTrust

Once events have been indexed they cannot be changed in any way. Your only option is as you've already surmised - index the updated event and use SPL to filter out old events and only look at ones with the most recent indexed-on-date.
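To make the filtering step concrete, here is a small Python model of the "keep only the most recently indexed copy" logic. It is only a sketch of the idea, not Splunk code: the `event_id` and `index_time` keys are hypothetical stand-ins for a correlation field you would add yourself and for Splunk's `_indextime`.

```python
def latest_per_id(events):
    """Keep, for each event_id, the event with the highest index_time.

    `events` is a list of dicts with hypothetical `event_id` and
    `index_time` keys; on a tie, the event seen later in the list wins.
    """
    latest = {}
    for event in events:
        key = event["event_id"]
        current = latest.get(key)
        if current is None or event["index_time"] >= current["index_time"]:
            latest[key] = event
    return list(latest.values())
```

In SPL the same effect would probably come from something along the lines of `| sort 0 -_indextime | dedup event_id`, again assuming your events carry a field like `event_id`.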
