Is it possible to fix a scripted input once it's been indexed?

samian
Engager

I'm writing a Splunk App and looking for a few pointers on how to approach the following:

  • A scripted input requests events from a REST API.
  • Sometimes, but not often, an event needs to be corrected after it's been indexed.
  • Is this possible?
  • What I was thinking is that my input script could run a search through Splunk's REST API, then delete the old event and index the updated event.
  • This brings up the question of whether the input script would have the correct permissions to search for and delete events via the REST API.
  • The events are stored in an index specific to this application.
  • The only other option would be to just index the updated event and use Splunk's search language to filter out old events, looking only at the ones with the most recent indexed-on date.

elliotproebstel
Champion

It is technically possible to do what you are asking. You can create a service account that has permission to run Splunk searches that use the delete command (this requires the delete_by_keyword capability, which the built-in can_delete role provides). Note that delete does not remove data from the underlying storage; it only prevents the events from being returned in searches. Here's some documentation on this:
https://docs.splunk.com/Documentation/Splunk/7.0.0/Indexer/RemovedatafromSplunk

And then it is possible to feed Splunk new events with the same timestamps as the prior events but with revised data.

However, this is a terrible idea. Allowing a service account to delete data at will is asking for trouble. Even though you can, you definitely should not.

By far, your best bet is to go with your final suggestion - to log all the events/data and use SPL to find the correct data. This means you should give some good thought now, while you are architecting your script and processes, to how you will correlate revised logs and clearly identify them.
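
To make that correlation concrete, here is a minimal sketch of how the input script might tag each event, assuming the upstream API returns records with a stable identifier. The field names `id`, `event_id`, and `fetched_at` are hypothetical and only there for illustration; adapt them to whatever your API actually returns.

```python
import json
import sys
import time


def format_event(record, id_field="id"):
    """Serialize one API record as a single-line JSON event.

    event_id   - stable key copied from the upstream record (hypothetical name)
    fetched_at - epoch seconds at which this script pulled the record
    """
    event = dict(record)
    event["event_id"] = str(record[id_field])
    event["fetched_at"] = int(time.time())
    return json.dumps(event, sort_keys=True)


def emit(records, out=sys.stdout):
    # A scripted input hands events to Splunk by writing them to stdout.
    for record in records:
        out.write(format_event(record) + "\n")


if __name__ == "__main__":
    # Placeholder data standing in for the real API response:
    # the second record is a correction of the first.
    emit([{"id": 42, "status": "open"}, {"id": 42, "status": "closed"}])
```

With a stable key on every event, a search along the lines of `... | dedup event_id sortby -_indextime` should then keep only the most recently indexed version of each event. Treat that as a starting point to test against your own data, not a finished query.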

richgalloway
SplunkTrust

Once events have been indexed, they cannot be changed in any way. Your only option is, as you've already surmised, to index the updated event and use SPL to filter out old events, keeping only the ones with the most recent index time (the _indextime field).
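
For anyone unsure what that filter has to compute, here is the logic as a small Python sketch (not SPL). It assumes each event carries a stable key and an index time; the names `event_id` and `indexed_at` are hypothetical stand-ins.

```python
def latest_per_id(events, id_field="event_id", order_field="indexed_at"):
    """Return one event per id: the one with the highest order_field value.

    If two versions share the same order value, the later one in the
    input list wins.
    """
    latest = {}
    for event in events:
        key = event[id_field]
        current = latest.get(key)
        if current is None or event[order_field] >= current[order_field]:
            latest[key] = event
    return list(latest.values())


if __name__ == "__main__":
    sample = [
        {"event_id": "a", "indexed_at": 100, "status": "open"},
        {"event_id": "b", "indexed_at": 105, "status": "open"},
        {"event_id": "a", "indexed_at": 200, "status": "closed"},  # correction
    ]
    for event in latest_per_id(sample):
        print(event)
```

In Splunk itself the same result would come from a search-time operation that groups on the key and keeps the newest _indextime, so none of this runs in the input script.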
