Splunk Dev

Can I parse my log with a python script before indexing?

drebai
Explorer

Hi!
I'm reading the scripted input documentation, but I don't understand whether scripted inputs can help with what I'd like to do.
I would like to be able to save several different types of logs in the same format.
Is it possible to use a Python script to receive logs and parse them?
The logs are complex, and to get a single dashboard I first have to extract all the fields for each format and then use a custom search command (which I already created with intersplunk) to create new fields.
I would prefer to do an initial parsing step that extracts the same fields from all sources, creates the new fields, and saves the result.

Example: (it's just a simplified example of my situation)
format1: ###EXPECTED### {"field1":"value1"} ###ACTUAL### {"field1":"value2","field2":"value1"}
format2: timestamp \n exp_field: {"field1":"value1"}\n act_field {"field1":"value2","field2":"value1"}

In my dashboard I would like a count of the fields that differ between the two JSON objects.
Right now I need to extract the fields with two different regular expressions and then use a custom command that finds the fields that differ between the two JSON objects.
I would like to do all of this before indexing. Is that possible?

Thanks,
Deb

richgalloway
SplunkTrust
SplunkTrust

Yes, it's possible. I've written a number of Python scripts that read events from a source and transform them before handing them to Splunk for indexing. In a scripted input, your script does the work of reading the source data; there is nothing to "receive". The script opens a file, makes a REST request, or gets its input some other way, then transforms the data and writes the results to stdout. Whatever goes to stdout is what Splunk will index.
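
For illustration only, here is a minimal sketch of such a script for the two simplified formats in the question. It is not taken from any of the scripts mentioned above. The regular expressions, the `normalize` helper, the blank-line event separator, and passing the log path as a command-line argument are all assumptions made for this example.

```python
import json
import re
import sys

# Hypothetical patterns for the two simplified formats from the question.
FORMAT1 = re.compile(r"###EXPECTED###\s*(\{.*?\})\s*###ACTUAL###\s*(\{.*\})", re.S)
FORMAT2 = re.compile(r"exp_field:?\s*(\{.*?\})\s*act_field:?\s*(\{.*\})", re.S)


def normalize(raw):
    """Return one common dict for either format, or None if nothing matches."""
    for name, pattern in (("format1", FORMAT1), ("format2", FORMAT2)):
        match = pattern.search(raw)
        if match is None:
            continue
        try:
            expected = json.loads(match.group(1))
            actual = json.loads(match.group(2))
        except json.JSONDecodeError:
            return None  # malformed JSON payload: skip the event
        return {"source_format": name, "expected": expected, "actual": actual}
    return None


def main(path):
    # Assumption: raw events in the file are separated by blank lines.
    with open(path, encoding="utf-8") as handle:
        for raw in handle.read().split("\n\n"):
            event = normalize(raw)
            if event is not None:
                # Each line written to stdout becomes data for Splunk to index.
                print(json.dumps(event, sort_keys=True))


if __name__ == "__main__":
    if len(sys.argv) > 1:
        main(sys.argv[1])
```

Both formats come out as the same JSON shape, so a single set of field extractions can serve one dashboard. A real script would also need to remember which events it has already emitted, which this sketch does not attempt.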

---
If this reply helps you, Karma would be appreciated.

drebai
Explorer

Thank you!
Are there any guides or examples?
I can only find material about excluding fields directly in inputs.conf.

Yunagi
Communicator

Here is an example:
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/AdvancedDev/ScriptExample
As @richgalloway said, whatever goes to stdout (via "print") is what Splunk will index. So add a few lines in your Python script to format the output as needed.
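
As a rough sketch of what "a few lines to format the output" could look like for the use case in the question, the snippet below computes which fields differ between the expected and actual JSON objects and prints one JSON event that includes the count. The helper names `diff_fields` and `format_event`, the output field names, and the choice of JSON as the output format are assumptions for this example, and only top-level keys are compared.

```python
import json

_MISSING = object()  # sentinel for a key that is absent on one side


def diff_fields(expected, actual):
    """Return the sorted top-level keys whose values differ or exist on one side only."""
    keys = set(expected) | set(actual)
    return sorted(
        key for key in keys
        if expected.get(key, _MISSING) != actual.get(key, _MISSING)
    )


def format_event(expected, actual):
    """Build one JSON line carrying both payloads plus the precomputed diff."""
    changed = diff_fields(expected, actual)
    return json.dumps(
        {
            "expected": expected,
            "actual": actual,
            "diff_fields": changed,
            "diff_count": len(changed),
        },
        sort_keys=True,
    )


# Values taken from the simplified example in the question.
print(format_event({"field1": "value1"}, {"field1": "value2", "field2": "value1"}))
```

For the example values, `diff_count` is 2: `field1` has a different value and `field2` exists only in the actual object. With the count stored in the event itself, the dashboard can aggregate it directly without the custom search command.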
