All Apps and Add-ons

Transform log file or field at index time using script/python instead of at search time?

rnauman
Explorer

I have a base64 field in my IIS log file. There are 3 very important properties within the base64 string that I want to extract at index time. It looks like everything available within splunk will be translated at search time and not added to the index.

What I don't want to have to do is manage a scheduled process (windows) on each server to run a transform script on the log, make sure it ran, process it intelligently to avoid re-processing already translated rows, having splunk monitor the translated log instead, etc. This was largely the purpose of Splunk.

I would even be ok if splunk orchestrated running the transform script if it couldn't directly do the decode at index time. E.g., splunk runs this script before indexing.

I am currently using a search app to do the decoding with python but doing nothing more than calling the following is a 13-15x performance hit. I want to be able to filter based off of these 3 decoded properties and that makes this approach unacceptable.

results = splunk.Intersplunk.getOrganizedResults()
for r in results
    // do nothing

Any help or suggestions are appreciated

Ayn
Legend

It sounds like a scripted input would meet your requirements? http://docs.splunk.com/Documentation/Splunk/6.0/Data/Setupcustominputs

Create a scripted input that runs on whatever interval you want, Splunk will ingest whatever output it has and will index the translated log data.

0 Karma

zsavushkin
Engager

Any suggestions?

0 Karma

tcador
New Member

Was there ever an answer to this? I'm in the same situation as this:

"Ideally I'd have all the scripting run on the single indexer from the variable number of forwarder-provided-data. Is that possible? E.g. forwarder sends raw log -> indexer -> decode log from forwarder(s) -> index"

0 Karma

rnauman
Explorer

I'm using universal forwarders on each of the target machines. This gets sent via TCP to the single indexer instance. I'm not sure if scripted inputs will run on light forwarders. Ideally I'd have all the scripting run on the single indexer from the variable number of forwarder-provided-data. Is that possible? E.g. forwarder sends raw log -> indexer -> decode log from forwarder(s) -> index