Is it possible to use a Python script to perform transforms during event indexing?
My aim is to remove keys from JSON events to reduce volume. I'm thinking of using a Python script that decodes the JSON, modifies the resulting dict, and then encodes the result as new JSON to be indexed.
Yes, you can achieve this by using a Python script as a scripted input in Splunk. You can read the data using Python, perform the modifications as you described (decoding the JSON, updating the dictionary, and re-encoding it), and output the modified data.
Here's how it works:
1. Create a Python script that performs the steps you described (decode the JSON, update the dict, re-encode it) and writes the result to stdout.
2. Configure the script as a scripted input in Splunk (via inputs.conf or Settings > Data inputs > Scripts).
The script will run at the configured intervals, fetch the data, apply your changes, and send the transformed data to Splunk for indexing.
Important Consideration:
The main limitation is that data ingestion will depend on the cron schedule of the scripted input, so real-time or very frequent data processing might not be achievable. Adjust the schedule as needed based on your data update frequency.
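The scripted-input approach described above could be sketched roughly as follows. The key names are placeholders, not from the thread; the script assumes events arrive one JSON object per line on stdin:

```python
#!/usr/bin/env python
"""Scripted-input sketch: read JSON events, drop unwanted keys,
and print the slimmed-down JSON to stdout for Splunk to index."""
import json
import sys

# Hypothetical keys to strip in order to reduce indexed volume
KEYS_TO_REMOVE = {"debug_info", "internal_id"}

def slim(event):
    """Return a copy of the event dict without the unwanted keys."""
    return {k: v for k, v in event.items() if k not in KEYS_TO_REMOVE}

def main(stream=sys.stdin):
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except ValueError:
            # Pass malformed lines through unchanged rather than drop them
            print(line)
            continue
        print(json.dumps(slim(event)))

if __name__ == "__main__":
    main()
```

Anything printed to stdout by a scripted input is what Splunk indexes, so the re-encoded JSON replaces the original event.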
Thank you for your reply.
I can't pre-process the events before ingestion into Splunk because they are sent directly by an appliance to a HEC input.
Christian
Hi @ktn01 ,
the only solution is to apply INGEST_EVAL rules to your input instead of a Python script.
Ciao.
Giuseppe
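For reference, an INGEST_EVAL rule to drop keys at index time might look like the fragment below. The sourcetype name, transform name, and key names are hypothetical, and `json_delete` assumes a Splunk version where JSON eval functions are available in INGEST_EVAL:

```
# transforms.conf
[strip_json_keys]
INGEST_EVAL = _raw=json_delete(_raw, "debug_info", "internal_id")

# props.conf
[my_appliance_sourcetype]
TRANSFORMS-strip = strip_json_keys
```

Because this runs in the indexing pipeline, it works even for events arriving over HEC, with no pre-processing outside Splunk.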
Hi @ktn01 ,
yes, it's possible, but it isn't done inside Splunk because the data is pre-processed before ingestion: I did this for a customer.
Pay attention to one issue: if you change the format of your logs, you have to completely rebuild the parsing rules for your data, because the standard parsing rules no longer apply to the new format.
Ciao.
Giuseppe