OK, I've got a stream of, potentially, over 100 different event formats that I want to send into Splunk. Inside each event I specify the sourcetype I'd like Splunk to use to process it. That is the only reliable way to determine the format of the rest of the event, including the timestamp and the searchable fields.
They come in over a TCP port, where they get a sourcetype of GENERIC. The stanzas in props.conf and transforms.conf then fire and correctly change the sourcetype to the one embedded in the event. But none of the rules I've written to key off the now correctly set sourcetype will fire.
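For anyone following along, a minimal sketch of the kind of per-event sourcetype override described above. It assumes each event carries a token of the form `sourcetype=NAME`; the transform name `set_sourcetype_from_event` and the regex are hypothetical placeholders, not the actual config from this setup.

```ini
# props.conf
# GENERIC is the sourcetype assigned to everything arriving on the TCP input.
[GENERIC]
TRANSFORMS-set_sourcetype = set_sourcetype_from_event

# transforms.conf
# Hypothetical transform: capture the embedded sourcetype name and write it
# into the event's sourcetype metadata at index time.
[set_sourcetype_from_event]
REGEX = sourcetype=(\w+)
FORMAT = sourcetype::$1
DEST_KEY = MetaData:Sourcetype
```

This part works as expected: events are indexed under the embedded sourcetype rather than GENERIC.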
My conclusion after 2-3 days of poking around with Splunk and reading various questions and answers is that it's actually impossible. Can someone confirm that this is due to Splunk's design? Basically, are the transforms that change the sourcetype applied too late in the process for Splunk to go back and honor the timestamp and field extraction rules for the new sourcetype?
Setting CLONE_SOURCETYPE=$1 in the sourcetype-setting transform doesn't seem to work (and, even if it did, I suspect it wouldn't inject the cloned event back into the processing flow early enough to make a difference). There's also no way to trigger a specific transform based on a REGEX match (and the best that could do is run CLONE_SOURCETYPE=FORMAT1).
So, really, is one TCP port per sourcetype the only way to get this to work?
Even the SDKs don't seem to be able to intercept events before they are indexed.
Guess I'm going to get the award for most unwieldy app ever - 100+ TCP port definitions.