Splunk Dev

Can I parse my log with a python script before indexing?

drebai
Explorer

Hi!
I'm reading the scripted input documentation, but I don't understand whether scripted inputs can help with what I'd like to do.
I would like to save several different types of logs in the same format.
Is it possible to use a Python script to receive logs and parse them?
The logs are complex, and to get a single dashboard I first have to extract all the fields for each format and then use a custom search command (which I have already created with intersplunk) to create new fields.
I would prefer to do an initial parsing step that extracts the same fields from all sources, creates the new fields, and saves the result.

Example: (it's just a simplified example of my situation)
format1: ###EXPECTED### {"field1":"value1"} ###ACTUAL### {"field1":"value2","field2":"value1"}
format2: timestamp \n exp_field: {"field1":"value1"}\n act_field {"field1":"value2","field2":"value1"}

In my dashboard I would like a count of the fields that differ between the two JSON objects.
Right now I have to extract the fields with two different regular expressions and then use a custom command that finds the fields that differ between the two JSON objects.
I would like to do everything before indexing. Is that possible?

Thanks,
Deb

richgalloway
SplunkTrust

Yes, it's possible. I've written a number of Python scripts that read events from a source and transform them before handing them to Splunk for indexing. In a scripted input, your script does the work of reading the source data; there is nothing to "receive". The script opens a file, makes a REST request, or does something else to get its input, then transforms the data and writes the results to stdout. Whatever goes to stdout is what Splunk will index.
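
For illustration only, here is a minimal sketch of that read, transform, write-to-stdout flow, using the two simplified formats from the question. The regular expressions, the `normalize` helper, and the idea of passing the source path as a command-line argument are assumptions made for this example, not part of any Splunk API. The reading loop assumes one event per line, so multi-line events such as format2 would need their own grouping logic before being passed to `normalize`.

```python
import json
import re
import sys

# Hypothetical patterns for the two simplified formats shown in the question.
FORMAT1 = re.compile(r"###EXPECTED###\s*(\{.*?\})\s*###ACTUAL###\s*(\{.*\})")
FORMAT2 = re.compile(r"exp_field:?\s*(\{.*?\})\s*act_field:?\s*(\{.*\})", re.DOTALL)


def normalize(raw):
    """Return {'expected': ..., 'actual': ...} for a raw event, or None if it cannot be parsed."""
    for pattern in (FORMAT1, FORMAT2):
        match = pattern.search(raw)
        if match:
            try:
                return {
                    "expected": json.loads(match.group(1)),
                    "actual": json.loads(match.group(2)),
                }
            except json.JSONDecodeError:
                return None
    return None


def main(path):
    """Read the source file and write one normalized JSON event per line to stdout."""
    with open(path, encoding="utf-8") as source:
        for line in source:  # assumption: one raw event per line
            event = normalize(line)
            if event is not None:
                # Whatever is written to stdout is what Splunk indexes.
                sys.stdout.write(json.dumps(event, sort_keys=True) + "\n")
    sys.stdout.flush()


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Because every source ends up in the same shape, a single set of field extractions can serve the dashboard regardless of the original format.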

---
If this reply helps you, Karma would be appreciated.

drebai
Explorer

Thank you!
Are there any guides or examples?
I can only find material about excluding fields directly in inputs.conf.

Yunagi
Communicator

Here is an example:
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/AdvancedDev/ScriptExample
As @richgalloway said, whatever goes to stdout (via "print") is what Splunk will index. So add a few lines in your Python script to format the output as needed.
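
As a rough sketch of what those few lines could look like for the use case in the question, the snippet below counts the fields that differ between the expected and actual JSON objects and prints the result as a single JSON line. The function names and the `diff_fields` and `diff_count` output keys are hypothetical names chosen for this example.

```python
import json


def differing_fields(expected, actual):
    """Return the sorted names of fields whose values differ or that exist on one side only."""
    missing = object()  # sentinel, so an explicit null is not confused with an absent key
    keys = set(expected) | set(actual)
    return sorted(k for k in keys if expected.get(k, missing) != actual.get(k, missing))


def format_event(expected, actual):
    """Build the single-line JSON event that the script would print for indexing."""
    changed = differing_fields(expected, actual)
    return json.dumps(
        {
            "expected": expected,
            "actual": actual,
            "diff_fields": changed,      # hypothetical field name
            "diff_count": len(changed),  # hypothetical field name
        },
        sort_keys=True,
    )


if __name__ == "__main__":
    # Values taken from the simplified example in the question.
    print(format_event({"field1": "value1"}, {"field1": "value2", "field2": "value1"}))
```

With the count computed before indexing, the dashboard search would no longer need the custom command to compare the two objects.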
