Getting Data In

Script to filter events at index time

ktn01
Path Finder

Is it possible to use a Python script to perform transforms during event indexing?

My aim is to remove keys from JSON events to reduce indexed volume. I'm thinking of using a Python script that decodes the JSON, modifies the resulting dict, and then encodes the result as new JSON that will be indexed.


jawahir007
Communicator

Yes, you can achieve this by using a Python script as a scripted input in Splunk. You can read the data using Python, perform the modifications as you described (decoding the JSON, updating the dictionary, and re-encoding it), and output the modified data.

Here's how it works:

  1. Create a Python Script:

    • Read the incoming data.
    • Apply the necessary transformations.
    • Print the modified JSON to standard output (stdout).
  2. Configure Scripted Input in Splunk:

    • Go to Settings > Data Inputs > Scripts.
    • Add a new scripted input and select your Python script.
    • Set a cron schedule for when the script should run.

The script will run at the configured intervals, fetch the data, apply your changes, and send the transformed data to Splunk for indexing.
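
For illustration, a minimal sketch of such a script might look like the following. The source file path and the key names are hypothetical placeholders; adapt them to your environment.

#!/usr/bin/env python3
# Minimal sketch of a Splunk scripted input: read JSON events from a
# source file, drop unwanted keys, and emit the reduced events to
# stdout so Splunk indexes the smaller payload.
# SOURCE_FILE and KEYS_TO_DROP are hypothetical placeholders.
import json

SOURCE_FILE = "/var/log/appliance/events.json"
KEYS_TO_DROP = {"debug_info", "raw_payload"}

def main():
    with open(SOURCE_FILE, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)      # decode the JSON event
            except ValueError:
                print(line)                   # pass non-JSON lines through unchanged
                continue
            for key in KEYS_TO_DROP:
                event.pop(key, None)          # remove the key if present
            print(json.dumps(event))          # re-encode and emit for indexing

if __name__ == "__main__":
    main()

The matching scripted-input stanza in inputs.conf could then look roughly like this (the app and script names are again placeholders):

[script://$SPLUNK_HOME/etc/apps/myapp/bin/filter_json.py]
interval = */5 * * * *
sourcetype = appliance:json
index = main
disabled = 0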

Important Consideration:
The main limitation is that data ingestion will depend on the cron schedule of the scripted input, so real-time or very frequent data processing might not be achievable. Adjust the schedule as needed based on your data update frequency.

ktn01
Path Finder

Thank you for your reply.
I can't pre-process the events before ingestion into Splunk because they are sent directly by an appliance to a HEC input.
Christian


isoutamo
SplunkTrust
Is it possible to ask the sender to reduce the content of the HEC events, or is that data also used somewhere else?

gcusello
SplunkTrust

Hi @ktn01 ,

the only solution is to apply INGEST_EVAL rules to your input instead of a Python script.
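
For example, a props.conf / transforms.conf pair along these lines can strip JSON keys at index time, assuming a Splunk version recent enough to support the json_delete() eval function. The sourcetype and key names are hypothetical placeholders:

# props.conf
[appliance:json]
TRANSFORMS-reduce_json = strip_json_keys

# transforms.conf
[strip_json_keys]
INGEST_EVAL = _raw := json_delete(_raw, "debug_info", "raw_payload")

In recent Splunk versions, index-time transforms like this also apply to events received over the HEC event endpoint, which is what makes this approach a fit for your case.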

Ciao.

Giuseppe

gcusello
SplunkTrust

Hi @ktn01 ,

yes, it's possible, but it isn't related to Splunk because the script pre-processes the data before ingestion; I did this for a customer.

Pay attention to one issue: if you change the format of your logs, you have to completely rebuild the parsing rules for your data, because the standard parsing rules are no longer applicable to the new format.
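
For instance, if the reduced events land under a new sourcetype, its parsing settings would need to be defined from scratch; a minimal hypothetical props.conf stanza might look like this (the sourcetype name, timestamp field, and formats are assumptions):

[appliance:json:reduced]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
KV_MODE = json
TIME_PREFIX = "timestamp":\s*"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%z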

Ciao.

Giuseppe
