Getting Data In

Script to filter events at index time

ktn01
Path Finder

Is it possible to use a Python script to perform transforms during event indexing?

My aim is to remove keys from JSON events to reduce volume. I'm thinking of using a Python script that decodes the JSON, modifies the resulting dict, and then encodes the result into a new JSON document that will be indexed.


Jawahir
Communicator

Yes, you can achieve this by using a Python script as a scripted input in Splunk. You can read the data using Python, perform the modifications as you described (decoding the JSON, updating the dictionary, and re-encoding it), and output the modified data.

Here's how it works:

  1. Create a Python Script:

    • Read the incoming data.
    • Apply the necessary transformations.
    • Print the modified JSON to standard output (stdout).
  2. Configure Scripted Input in Splunk:

    • Go to Settings > Data Inputs > Scripts.
    • Add a new scripted input and select your Python script.
    • Set a cron schedule for when the script should run.

The script will run at the configured intervals, fetch the data, apply your changes, and send the transformed data to Splunk for indexing.
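As a minimal sketch of step 1 (the key names and data source are hypothetical, not from the original question), such a script could look like this:

```python
import json
import sys

KEYS_TO_DROP = {"debug_info", "raw_payload"}  # hypothetical key names

def slim_event(raw_json):
    """Decode one JSON event, drop the unwanted keys, re-encode it."""
    event = json.loads(raw_json)
    for key in KEYS_TO_DROP:
        event.pop(key, None)  # ignore keys that are absent
    return json.dumps(event)

def emit(lines):
    """A scripted input simply prints events; Splunk indexes stdout."""
    for line in lines:
        line = line.strip()
        if line:
            print(slim_event(line))

# When Splunk runs this as a scripted input, feed it your data source, e.g.:
#     emit(open("/path/to/events.json"))
```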

Important Consideration:
The main limitation is that data ingestion will depend on the cron schedule of the scripted input, so real-time or very frequent data processing might not be achievable. Adjust the schedule as needed based on your data update frequency.
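For step 2, the scripted input can also be defined directly in inputs.conf; this is a sketch in which the script path, sourcetype, and index are placeholders:

```ini
# inputs.conf (paths and names are placeholders)
[script://$SPLUNK_HOME/etc/apps/my_app/bin/slim_events.py]
interval = */5 * * * *
sourcetype = appliance_json
index = main
disabled = 0
```

The interval setting accepts either a number of seconds or a cron expression, which is where the scheduling limitation above comes from.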

ktn01
Path Finder

Thank you for your reply.
I can't pre-process the events before ingestion into Splunk because they are sent directly by an appliance to an HEC input.
Christian


isoutamo
SplunkTrust
Is it possible to ask the sender to reduce the content of the HEC events, or is the full content used somewhere else as well?

gcusello
SplunkTrust

Hi @ktn01 ,

the only solution is to apply INGEST_EVAL rules to your input instead of a Python script.

Ciao.

Giuseppe
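Giuseppe doesn't give a configuration here, but as a hedged sketch of the INGEST_EVAL approach (assuming Splunk 8.1 or later, where the json_delete eval function is available; the transform name, sourcetype, and key names are placeholders), the keys could be dropped at index time like this:

```ini
# transforms.conf (on the indexer or heavy forwarder that parses the HEC data)
[strip_json_keys]
INGEST_EVAL = _raw=json_delete(_raw, "debug_info", "raw_payload")

# props.conf
[appliance_json]
TRANSFORMS-strip = strip_json_keys
```

This rewrites _raw before it is written to the index, so the removed keys never count against license volume.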

gcusello
SplunkTrust

Hi @ktn01 ,

yes, it's possible, but it isn't something Splunk itself does, because the data is pre-processed before ingestion: I did it for a customer.

Pay attention to one issue: if you change the format of your logs, you have to completely rebuild the parsing rules for your data, because the standard parsing rules no longer apply to the new data format.

Ciao.

Giuseppe
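Giuseppe doesn't describe his implementation, but one possible shape for such a pre-processor (purely an illustration, not necessarily what he built) is a small relay that receives the appliance's HEC posts, strips the unwanted keys from the event body, and forwards the result to the real Splunk HEC endpoint. The URL, token, port, and key names below are all placeholders:

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholders: point these at your real Splunk HEC endpoint and token.
SPLUNK_HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"
KEYS_TO_DROP = {"debug_info", "raw_payload"}  # hypothetical key names

def trim_hec_payload(body):
    """Drop unwanted keys from the 'event' object of one HEC payload."""
    payload = json.loads(body)
    event = payload.get("event")
    if isinstance(event, dict):
        for key in KEYS_TO_DROP:
            event.pop(key, None)  # ignore keys that are absent
    return json.dumps(payload)

class RelayHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        slimmed = trim_hec_payload(self.rfile.read(length).decode("utf-8"))
        request = urllib.request.Request(
            SPLUNK_HEC_URL,
            data=slimmed.encode("utf-8"),
            headers={"Authorization": "Splunk " + HEC_TOKEN},
        )
        urllib.request.urlopen(request)  # forward to the real HEC input
        self.send_response(200)
        self.end_headers()

# Point the appliance at this relay instead of Splunk, e.g.:
#     HTTPServer(("0.0.0.0", 8088), RelayHandler).serve_forever()
```

As Giuseppe notes above, keep the payload in the standard HEC JSON envelope so the existing parsing rules still apply.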
