Splunk Dev

Can I parse my log with a python script before indexing?

drebai
Explorer

Hi!
I'm reading the scripted input documentation, but I can't tell whether scripted inputs can help with what I'd like to do.
I would like to store several different types of logs in the same format.
Is it possible to use a Python script to receive logs and parse them?
The logs are complex, and to build a single dashboard I first have to extract all the fields for each format and then use a custom search command (which I have already written with intersplunk) to create new fields.
I would prefer to do an initial parsing step that extracts the same fields from all sources and creates the new fields, and then save the result.

Example: (it's just a simplified example of my situation)
format1: ###EXPECTED### {"field1":"value1"} ###ACTUAL### {"field1":"value2","field2":"value1"}
format2: timestamp \n exp_field: {"field1":"value1"}\n act_field {"field1":"value2","field2":"value1"}

In my dashboard I would like a count of the fields that differ between the two JSON objects.
Right now I need to extract the fields with two different regular expressions and then use a custom command that finds the fields that differ between the two JSON objects.
I would like to do all of this before indexing. Is that possible?

Thanks,
Deb


richgalloway
SplunkTrust
SplunkTrust

Yes, it's possible. I've written a number of Python scripts that read events from a source and transform them before handing them to Splunk for indexing. In a scripted input, your script does the work of reading the source data; there is nothing to "receive". The script opens a file, makes a REST request, or does something else to get its input, then transforms the data and writes the results to stdout. Whatever goes to stdout is what Splunk will index.
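
To illustrate that flow, here is a minimal sketch of such a script. It is an illustration only, not one of the scripts mentioned in this answer. The path in SOURCE_PATH and the transform() logic are hypothetical placeholders; the only part taken from the answer is the overall shape: read the source, transform each event, write the result to stdout.

```python
import json
import os
import sys

# Hypothetical source file; a real script might call a REST API instead.
SOURCE_PATH = "/var/log/myapp/source.log"


def transform(raw_line):
    """Turn one raw line into a normalized dict (placeholder logic)."""
    text = raw_line.strip()
    return {"message": text, "raw_length": len(text)}


def emit_events(lines, out=sys.stdout):
    """Write one JSON event per line to `out` and return the event count."""
    count = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        out.write(json.dumps(transform(line), sort_keys=True) + "\n")
        count += 1
    out.flush()
    return count


if __name__ == "__main__":
    # Splunk runs the script and indexes whatever it writes to stdout.
    if os.path.exists(SOURCE_PATH):
        with open(SOURCE_PATH, encoding="utf-8") as handle:
            emit_events(handle)
```

A script like this is registered as a scripted input through a [script://...] stanza in inputs.conf, where the interval setting controls how often Splunk runs it.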

---
If this reply helps you, Karma would be appreciated.

drebai
Explorer

Thank you!
Are there any guides or examples?
I can only find material about excluding fields directly in inputs.conf.

Yunagi
Communicator

Here is an example:
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/AdvancedDev/ScriptExample
As @richgalloway said, whatever goes to stdout (via "print") is what Splunk will index. So add a few lines in your Python script to format the output as needed.
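
For the two simplified formats in the question, the formatting step could look roughly like the sketch below. The regular expressions assume the layouts exactly as shown in the example (with valid JSON on both sides), and the function names and output field names (source_format, diff_fields, diff_count) are made up for illustration. A field counts as "different" here if its value differs or if it appears on only one side.

```python
import json
import re

# Patterns for the two example layouts from the question (assumed, simplified).
FORMAT1 = re.compile(r"###EXPECTED###\s*(\{.*?\})\s*###ACTUAL###\s*(\{.*\})", re.S)
FORMAT2 = re.compile(r"exp_field:?\s*(\{.*?\})\s*act_field:?\s*(\{.*\})", re.S)

_MISSING = object()  # sentinel so an absent key never equals a real value


def differing_fields(expected, actual):
    """Return sorted field names whose values differ or exist on one side only."""
    keys = set(expected) | set(actual)
    return sorted(
        k for k in keys if expected.get(k, _MISSING) != actual.get(k, _MISSING)
    )


def normalize(raw_event):
    """Map either format to one common dict, or None if nothing can be parsed."""
    for name, pattern in (("format1", FORMAT1), ("format2", FORMAT2)):
        match = pattern.search(raw_event)
        if not match:
            continue
        try:
            expected = json.loads(match.group(1))
            actual = json.loads(match.group(2))
        except ValueError:
            return None  # malformed JSON in the event
        diff = differing_fields(expected, actual)
        return {
            "source_format": name,
            "expected": expected,
            "actual": actual,
            "diff_fields": diff,
            "diff_count": len(diff),
        }
    return None


if __name__ == "__main__":
    sample = '###EXPECTED### {"field1":"value1"} ###ACTUAL### {"field1":"value2","field2":"value1"}'
    # One JSON object per line on stdout becomes one event in Splunk.
    print(json.dumps(normalize(sample), sort_keys=True))
```

Printing one JSON object per event means both log formats end up with the same field names, so the dashboard can count on diff_count directly instead of running two regular expressions and a custom command at search time.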
