Splunk Dev

Can I parse my log with a python script before indexing?

drebai
Explorer

Hi!
I'm reading the scripted input documentation but I don't understand if they can help me in what I'd like to do.
I would like to be able to save some types of different logs in the same format.
Is it possible to use a python script to receive logs and parser them?
The logs are complex and to get a unique dashboard I first have to extract all the fields for each format and use custom search command (that I created already with intersplunk to created new fields).
I would prefer to do an initial parsing in order to extract the same fields from all sources and created new fields (and saved that).

Example: (it's just a simplified example of my situation)
format1: ###EXPECTED### {"field1":"value1} ###ACTUAL### {"field1":"value2","field2":"value1"}
format2: timestamp \n exp_field: {"field1":"value1}\n act_field {"field1":"value2","field2":"value1"}

In my dashboard I would like a count of different fields between jsons.
Now I need to extract the fileds with two different regExp and then use a custom command that extracts the different fields between the two jsons.
I would like to do everything before indexing. It's possible?

Thanks,
Deb

Tags (2)
0 Karma
1 Solution

richgalloway
SplunkTrust
SplunkTrust

Yes, it's possible. I've written a number of Python scripts that read events from a source and transform them before handing them to Splunk for indexing. In a scripted input, your script does the work of reading the source data - there is nothing to "receive". The script opens the file or makes a REST request or does something else to get its input then it does the transformation and writes the results to stdout. Whatever goes to stdout is what Splunk will index.

---
If this reply helps you, Karma would be appreciated.

View solution in original post

richgalloway
SplunkTrust
SplunkTrust

Yes, it's possible. I've written a number of Python scripts that read events from a source and transform them before handing them to Splunk for indexing. In a scripted input, your script does the work of reading the source data - there is nothing to "receive". The script opens the file or makes a REST request or does something else to get its input then it does the transformation and writes the results to stdout. Whatever goes to stdout is what Splunk will index.

---
If this reply helps you, Karma would be appreciated.

drebai
Explorer

Thank you!
Are there any guides or examples?
I only find things concerning the exclusion of fields directly from the input.conf

0 Karma

Yunagi
Communicator

Here is an example:
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/AdvancedDev/ScriptExample
As @richgalloway said, whatever goes to stdout (via "print") is what Splunk will index. So add a few lines in your Python script to format the output as needed.

Get Updates on the Splunk Community!

Splunk App Dev Community Updates – What’s New and What’s Next

Welcome to your go-to roundup of everything happening in the Splunk App Dev Community! Whether you're building ...

The Latest Cisco Integrations With Splunk Platform!

Join us for an exciting tech talk where we’ll explore the latest integrations in Cisco + Splunk! We’ve ...

Enterprise Security Content Update (ESCU) | New Releases

In April, the Splunk Threat Research Team had 2 releases of new security content via the Enterprise Security ...