Getting Data In

Logs ingested via sinkhole are not receiving the appropriate sourcetype

castle1126
Communicator

Hi,

I have a Linux system running 4.1.2 of Splunk. On that system I'm trying to use an input (set up in the LOCAL/INPUTS.CONF file) to grab data from a sinkhole directory and set the sourcetype accordingly. The snippet of the inputs.conf is here:

[batch:///tmp/proxy1]
move_policy = sinkhole
index = proxy
sourcetype = myweb
crcSalt =

I also have a LOCAL/PROPS.CONF that has the following information for the sourcetype:

[myweb]
KV_MODE = none
SHOULD_LINEMERGE = False
REPORT-myweb_csv = myweb_csv
TRANSFORMS-nullheader = nullheader

I do see the entries come into the index named "proxy", but the sourcetype does not say myweb, it says "myweb-2". And now I see an entry in the PROPS.CONF file located in $SPLUNK_HOME/etc/apps/learned/local that has the "myweb-2" sourcetype included there.

What piece am I missing to allow the sink-holed data to keep the sourcetype I set in the INPUTS.CONF file?


jrodman
Splunk Employee

This is an artifact of the way Splunk stores CSV header field names. If you do not need the field names handled for you, you can disable the AUTO_HEADER setting in props.conf, for example in your [myweb] stanza.
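A minimal sketch of that change, reusing the [myweb] stanza from the question. One assumption to flag: the header-detection setting I know from props.conf.spec is named CHECK_FOR_HEADER, and the answer's AUTO_HEADER appears to refer to it, so confirm the exact name against the spec file shipped with your version. The file path in the comment is also an assumption; use whichever local props.conf already holds the stanza.

```
# props.conf (for example $SPLUNK_HOME/etc/system/local/props.conf; path assumed)
[myweb]
KV_MODE = none
SHOULD_LINEMERGE = False
REPORT-myweb_csv = myweb_csv
TRANSFORMS-nullheader = nullheader
# Assumed setting name: turns off header-based field naming for this sourcetype,
# which is what produces the numbered "myweb-2" style sourcetypes
CHECK_FOR_HEADER = false
```

A restart of Splunk is normally needed before index-time props.conf changes apply to newly read files.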

Unfortunately, in 4.1.2, the implied sourcetype (csv, for .csv files) is applied even when an explicit sourcetype is set, as you have done. In 4.1.4 and later, this should not occur, so the renamed sourcetype should not appear and no configuration changes should be needed.

What you lose from this change is that the events will not automatically have named fields. If you always have the same field names in the same order in this data source, you can resolve the issue by creating a DELIMS/FIELDS extraction for your sourcetype.
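As an illustration of such an extraction, the sketch below shows what a transforms.conf stanza for the myweb_csv report (already referenced by REPORT-myweb_csv in the question's props.conf) might look like. The asker's actual transform is not shown in the thread, and the field names here are hypothetical placeholders; replace them with the real column names in the order they appear in the file.

```
# transforms.conf, in the same local directory as props.conf
# Referenced from props.conf by: REPORT-myweb_csv = myweb_csv
[myweb_csv]
# Split each event on commas
DELIMS = ","
# Hypothetical field names, listed in column order
FIELDS = "timestamp", "client_ip", "url", "status"
```

Because REPORT- extractions run at search time, this names the fields without relying on the header line in each file.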

jrodman
Splunk Employee

Yes, unless you were to, for example, disable AUTO_HEADER for the csv sourcetype. The AUTO_HEADER behavior is designed around storing the names of the fields in a sourcetype, and 'learned' is the app that came to exist around storing generated sourcetypes for other purposes.


castle1126
Communicator

Thanks for the response. So are you saying that because I'm at 4.1.2 I will continue to have the learned sourcetype be set as the data is read in?
