
"ghost" sourcetype during sourcetype override

Communicator

We are using HEC to ingest logs from a cloud platform.

Environment details: HEC running on a Windows instance of Splunk 7.0.3.

Sourcetype A is sent in the event payload, which overrides the sourcetype set in the per-token stanza.

To override it to B, we use props.conf and transforms.conf as below.

props.conf

[A]
TRANSFORMS-sourcetype = transformname

[B]
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=json
SHOULD_LINEMERGE=false
category=Structured
description=JavaScript Object Notation format. For more information, visit http://json.org/
disabled=false
pulldown_type=true
TIME_FORMAT=timeformat
LINE_BREAKER=([\r\n]+)
TIME_PREFIX=timeprefix

transforms.conf

[transformname]
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::B

This renames the sourcetype and assigns timestamps for B as expected.

What I cannot comprehend is that when I search for raw events using index= .. I see an equal count of events for the A and B sourcetypes. It gets weirder when I run stats count by sourcetype: a count is returned only for B.

It's as though A exists in the raw data search and does not exist at the same time.

index= sourcetype=A does not return events. When I search index= sourcetype=B, both appear.

Can you please help me work out how to fix this?


Re: "ghost" sourcetype during sourcetype override

Ultra Champion

Firstly: index-time settings such as timestamping and line breaking do not take effect when they are applied to a sourcetype that is set by a TRANSFORMS. You're probably seeing Splunk's automagic line breaking and timestamping at work. You always need to set those configurations on the original sourcetype.
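
As a rough sketch of that first point, the index-time settings from the posted [B] stanza could be moved to the original sourcetype like this. The values timeprefix and timeformat are the placeholders from the original post, and I have not tested this against HEC input, so treat it as an assumption about the layout rather than a verified fix.

```ini
# props.conf (sketch, untested): index-time settings placed on the
# ORIGINAL sourcetype A, because that is the sourcetype the event
# carries when line breaking and timestamping are evaluated.
[A]
TRANSFORMS-sourcetype = transformname
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
# Placeholder values copied from the original post
TIME_PREFIX = timeprefix
TIME_FORMAT = timeformat
```

The [B] stanza would then only need to hold the search-time settings.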

Secondly: since the sourcetype is included in the JSON data, it will be extracted again at search time, because you have KV_MODE=json. I'm not 100% sure why you get that inconsistent behavior (probably because the TRANSFORMS changes the indexed sourcetype value), but I would suggest changing KV_MODE=json to KV_MODE=none. You already have the JSON fields extracted by INDEXED_EXTRACTIONS=json; extracting them again at search time with KV_MODE=json leads to duplicate extractions, including the original sourcetype value from the JSON data.
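
The second suggestion would look roughly like the following. This is only a sketch of the one-line change; it assumes the rest of the [B] stanza stays as posted and that the stanza is deployed where searches run, since KV_MODE is a search-time setting.

```ini
# props.conf (sketch): turn off search-time JSON extraction for B so the
# "sourcetype" key inside the JSON payload is not extracted a second time.
[B]
KV_MODE = none
```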


Re: "ghost" sourcetype during sourcetype override

Communicator

Hi @FrankVl, thanks for the response.

I previously tried KV_MODE=none as well, to no avail. It still seems to exhibit this behaviour.

As for the timestamping, that was my understanding as well, but it does work with the stanza I posted earlier. Without the time prefix and format, the timestamp is wrong; with them, it works great.

I just confirmed with some additional testing that this behaviour of displaying both A and B seems to occur only in real-time searches. Historical searches work just fine.


Re: "ghost" sourcetype during sourcetype override

Communicator

The issue seems to have been ephemeral, but weird nonetheless.
