I have a Python script that queries an external system for reputation data based on a hash. For every event ingested by Splunk, I would like to run this script before indexing and add a field containing the hash reputation to the raw event. The events are currently a mix of XML and CEF syslog, and only two fields could contain a hash value.
Ideally what would happen, probably on the HFW, would be:
This would allow me to search Splunk for all hashes with a certain reputation, and perform other operations, without having to perform a lookup. My thought was that this could be accomplished with props.conf and transforms.conf calling an external script, similar to an external lookup. However, I'm not sure how to configure Splunk to do this without a lookup table, or how to add the field to the event before it is indexed.
Does the additional field have to be added at index time, or can it be added at search time?
I would probably not add this field at index time; I would add it at search time instead. The reasons are:
I save storage space, because the reputation field is no longer stored in the index.
I can change the event's reputation in the future. If you add the reputation field at index time, it will always be there and cannot be changed (easily). If the reputation for an event was assigned incorrectly, or needs to be re-classified, you can simply update your reputation field extraction to assign the correct value to the reputation field.
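As a rough illustration of the search-time approach, here is a minimal sketch of an external lookup script. Splunk passes the lookup fields to the script as CSV on stdin and reads the completed CSV back from stdout. The field names `hash` and `reputation`, and the `query_reputation` function with its placeholder data, are assumptions for illustration only; the call to the real reputation system would go there.

```python
import csv
import io
import sys


def query_reputation(file_hash):
    """Hypothetical stand-in for the real reputation query.

    Replace the body with the call to the external system; the
    dictionary below is placeholder data only.
    """
    known = {"44d88612fea8a8f36de82e1278abb02f": "malicious"}
    return known.get(file_hash.lower(), "unknown")


def run_lookup(infile, outfile, hash_field="hash", rep_field="reputation"):
    """Read CSV rows, fill in the reputation column, write CSV back out."""
    reader = csv.DictReader(infile)
    # Fall back to the two expected columns if the input has no header.
    fieldnames = list(reader.fieldnames or [hash_field, rep_field])
    if rep_field not in fieldnames:
        fieldnames.append(rep_field)

    writer = csv.DictWriter(outfile, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        file_hash = (row.get(hash_field) or "").strip()
        # Only query when there is a hash and no reputation yet.
        if file_hash and not row.get(rep_field):
            row[rep_field] = query_reputation(file_hash)
        writer.writerow(row)


if __name__ == "__main__":
    run_lookup(sys.stdin, sys.stdout)
```

To use something like this, the script would be registered as an external lookup in transforms.conf (the `external_cmd` and `fields_list` settings) and, if desired, applied automatically with a `LOOKUP-` entry in props.conf. Because the value is resolved at search time, a re-classified hash shows its new reputation on the next search without re-indexing.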