Solved: Logs ingested via sinkhole are not receiving the appropriate sourcetype

castle1126 · ‎11-01-2010

Hi,

I have a Linux system running Splunk 4.1.2. On that system I'm trying to use an input (set up in local/inputs.conf) to grab data from a sinkhole directory and set the sourcetype accordingly. Here is the relevant snippet of inputs.conf:

[batch:///tmp/proxy1]
move_policy = sinkhole
index = proxy
sourcetype = myweb
crcSalt =

I also have a local/props.conf that has the following settings for the sourcetype:

[myweb]
KV_MODE = none
SHOULD_LINEMERGE = False
REPORT-myweb_csv = myweb_csv
TRANSFORMS-nullheader = nullheader

I do see the entries come into the index named "proxy", but the sourcetype is not myweb; it is "myweb-2". I also now see an entry in the props.conf file under $SPLUNK_HOME/etc/apps/learned/local that defines the "myweb-2" sourcetype.

What am I missing that would let the sinkholed data keep the sourcetype I set in inputs.conf?

jrodman · ‎11-01-2010

This is an artifact of the way Splunk stores CSV header field names. If you do not need the field names handled for you, you can disable the automatic header handling (the CHECK_FOR_HEADER setting) in props.conf, for example in your [myweb] stanza.
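A minimal sketch of that change, built on the [myweb] stanza from the question. The setting name CHECK_FOR_HEADER is the one documented in props.conf.spec for this header handling; treat both the name and the file location as assumptions and check the spec shipped with your version. On 4.1.2 the stanza-level setting may not be enough on its own, since the implied csv sourcetype can still apply.

```ini
# local/props.conf (location assumed; same file as in the question)
[myweb]
# Assumed setting name; turns off automatic CSV header handling
CHECK_FOR_HEADER = false
KV_MODE = none
SHOULD_LINEMERGE = False
REPORT-myweb_csv = myweb_csv
TRANSFORMS-nullheader = nullheader
```

Comments are kept on their own lines because Splunk .conf files do not treat a trailing `#` on a setting line as a comment.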

Unfortunately, in 4.1.2, the implied sourcetype (csv, for .csv files) is applied even when an explicit sourcetype is set, as you have done. In 4.1.4 and later this should not occur, so after upgrading the artifact should go away without any configuration changes.

What you lose by disabling it is that events will no longer get named fields automatically. If this data source always has the same field names in the same order, you can resolve that by creating a DELIMS/FIELDS extraction for your sourcetype.
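As a rough sketch, a DELIMS/FIELDS extraction for this sourcetype could look like the stanza below. The stanza name myweb_csv comes from the REPORT-myweb_csv line in the question's props.conf; the comma delimiter and the field names are hypothetical placeholders, since the actual columns are not shown in this thread.

```ini
# local/transforms.conf (location assumed)
[myweb_csv]
# Delimiter assumed to be a comma
DELIMS = ","
# Hypothetical field names; replace with the real columns, in file order
FIELDS = field1, field2, field3
```

The existing REPORT-myweb_csv = myweb_csv line in the [myweb] stanza is what would tie this extraction to the sourcetype at search time.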

jrodman · ‎11-02-2010

Yes, unless you were to, for example, disable the automatic header handling (CHECK_FOR_HEADER) for the csv sourcetype. That behavior is designed around storing the names of the fields in a sourcetype, and 'learned' is the app that came to exist for storing generated sourcetypes for other purposes.
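For illustration only, disabling it for the csv sourcetype might look like this in a local props.conf. The setting name CHECK_FOR_HEADER is assumed from props.conf.spec, and this affects every input that ends up as csv, not just the proxy logs.

```ini
# local/props.conf (location assumed)
[csv]
# Assumed setting name; overrides the default header handling for csv
CHECK_FOR_HEADER = false
```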

castle1126 · ‎11-01-2010

Thanks for the response. So are you saying that because I'm on 4.1.2, the learned sourcetype will continue to be set as the data is read in?
