We are using HEC to ingest logs from a cloud platform.
Environment details: HEC running on a Windows instance of Splunk 7.0.3.
Sourcetype A is sent in the event payload, which overrides the sourcetype set in the per-token stanza.
To override it to B, we use props.conf and transforms.conf as below.
TRANSFORMS-sourcetype = transformname
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::B
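For reference, here is a minimal sketch of how settings like these typically sit in their stanzas. The `[A]` stanza name in props.conf and the catch-all `REGEX` are assumptions; the post does not show them.

```ini
# props.conf
# Assumption: the stanza is keyed on the incoming sourcetype A.
[A]
TRANSFORMS-sourcetype = transformname

# transforms.conf
[transformname]
# Assumption: match every event; the post does not show the REGEX used.
REGEX = .
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::B
```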
This renames the sourcetype and assigns timestamps for B as expected.
What I cannot comprehend is this: when I search for raw events using index= .. I see an equal count of events for sourcetypes A and B. It gets weirder when I run stats count by sourcetype: the count is returned only for B.
It's as though A exists in the original raw data search but does not exist at the same time.
Searching index= sourcetype=A does not return events. When I search index= sourcetype=B, both appear.
Can you please help on how I go about fixing this?
Firstly: applying index-time settings such as timestamping and line breaking to a sourcetype that is set using a TRANSFORMS does not work. You're probably seeing Splunk's automagic line breaking and timestamping at work. You always need to set those configurations for the original sourcetype.
Secondly: since the sourcetype is included in the JSON data, it will get extracted again at search time, because you have KV_MODE=json. I'm not 100% sure why you get that inconsistent behavior (probably because the TRANSFORMS changes the indexed sourcetype value), but I would suggest changing that to KV_MODE=none. You already have the JSON fields extracted using INDEXED_EXTRACTIONS=json; extracting them again at search time using KV_MODE=json will lead to duplicate extractions, including extraction of the original sourcetype value from the JSON data.
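The layout suggested above could be sketched roughly as follows. This assumes index-time settings are looked up under the incoming sourcetype A and search-time settings under the rewritten sourcetype B; the stanza names come from the question, and the timestamp values are left out because the thread does not give them.

```ini
# props.conf (sketch only)

# Index-time settings: keyed on the sourcetype the event arrives with.
[A]
TRANSFORMS-sourcetype = transformname
# TIME_PREFIX and TIME_FORMAT would also go here (values not shown in the thread).

# Search-time settings: keyed on the sourcetype stored in the index.
[B]
# Skip search-time JSON extraction so the payload's sourcetype field
# is not extracted a second time.
KV_MODE = none
```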
Hi @FrankVl thanks for the response.
I have previously tried KV_MODE=none as well, to no avail. It still exhibits this behaviour.
As for the timestamping, that was my understanding as well, but it does work well with the stanza I posted earlier. Without the time prefix and format, the time is erroneous, but with them it works great.
I just confirmed with some additional testing that this behaviour of displaying both A and B seems to occur only in real-time searches. Historical searches work just fine.