I have a Python script that queries an external system for reputation data based on a hash. For every event ingested by Splunk, I would like to execute this script prior to indexing and add a field containing the hash reputation to the raw event. The events are currently a mix of XML and CEF syslog, and only two fields could contain a hash value.
Ideally, probably on the heavy forwarder (HFW), this is what would happen:
This event comes in:
<title>Event 1</title>
<hash>123ac898aaa809f90a</hash>
This event goes out for Splunk to index:
<title>Event 1</title>
<hash>123ac898aaa809f90a</hash>
<reputation>Known Bad</reputation>
This would allow me to query for all hashes with a certain reputation in Splunk and perform other operations without having to perform a lookup. My thought was that props.conf and transforms.conf could call an external script, similar to an external lookup, but I'm not sure how to configure Splunk to do this without a lookup table and to add the field to the event prior to ingest.
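To make the desired transformation concrete, here is a minimal Python sketch of the enrichment step alone. It is not a Splunk API and does not show how Splunk would invoke it. The names `lookup_reputation`, `enrich_event`, and the `KNOWN_REPUTATIONS` table are hypothetical stand-ins for the real reputation script, and the sketch assumes the hash sits in a `<hash>` element as in the XML sample above.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical stand-in for the external reputation system.
KNOWN_REPUTATIONS = {"123ac898aaa809f90a": "Known Bad"}

# Assumes the hash appears in a <hash> element, as in the sample event.
HASH_RE = re.compile(r"<hash>([0-9a-fA-F]+)</hash>")


def lookup_reputation(hash_value):
    """Return the reputation for a hash, or "Unknown" if it is not listed."""
    return KNOWN_REPUTATIONS.get(hash_value.lower(), "Unknown")


def enrich_event(raw_event):
    """Append a <reputation> element to an event that contains a hash."""
    match = HASH_RE.search(raw_event)
    if match is None:
        # No hash found: pass the event through unchanged.
        return raw_event
    reputation = escape(lookup_reputation(match.group(1)))
    return raw_event.rstrip("\n") + "\n<reputation>" + reputation + "</reputation>"


if __name__ == "__main__":
    event = "<title>Event 1</title>\n<hash>123ac898aaa809f90a</hash>"
    print(enrich_event(event))
```

The CEF syslog events would need their own pattern for locating the hash, which is left out here.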
Does the additional field have to be added at index time, or can it be added at search time?
I would probably add this field at search time rather than at index time. The reasons are: