Getting Data In

Why is the sourcetype specified in inputs.conf on the universal forwarder not being applied to forwarded data?

lyndac
Contributor

I am using version 6.2.1. I have set up a forwarder to monitor a directory and forward the files to my indexer. The files are being forwarded (so I know the communication is OK), but the sourcetype specified in inputs.conf on the forwarder is not being applied. My test environment has 1 indexer, 1 forwarder, and 1 search head. I am not using a deployment server in my test environment.

The contents of each file to be forwarded look like this:
input_time, filename,host,filesize,
2015-07-08 08:12:34,file1.txt,myhost,33421
2015-07-08 08:13:34,file2.txt,anotherhost, 43465

I have set up the sourcetype (csv_br) and index on the indexer. Here are the contents of the relevant conf files.

forwarder: inputs.conf

[monitor:///opt2/data/forward/br/*.csv]
index=br
sourcetype = csv_br

indexer: props.conf

[csv_br]
INDEXED_EXTRACTIONS = csv
KV_MODE = none
SHOULD_LINEMERGE= false
TIME_PREFIX=^
MAX_TIMESTAMP_LOOKAHEAD = 20
TIMESTAMP_FIELDS = input_time
TIME_FORMAT = %Y-%m-%d %H:%M:%S
1 Solution

acharlieh
Influencer

You're using INDEXED_EXTRACTIONS so you should duplicate the props.conf configuration to the Universal Forwarder (with appropriate restarts of course). With INDEXED_EXTRACTIONS, the Universal Forwarder actually takes on more of the parsing tasks, by routing events through the structuredParsing pipeline. There's a good diagram on this here: http://wiki.splunk.com/Community:HowIndexingWorks

Part of the reason for this is that, to properly parse new events appended to files with headers (such as CSV or W3C) under INDEXED_EXTRACTIONS, you need the header row to know what the fields are named so they can be written at index time. If this parsing were done on the Indexer/Heavy Forwarder, you'd have trouble, since the header row would only be sent when a new file starts being indexed (assuming the UF followed its typical behavior of forwarding only the parts of the file it hasn't seen yet).

This includes the ability to nullQueue on the Universal Forwarder. As a side effect, however, the indexer does little (if anything) with this data other than write the events to disk, since they arrive fully parsed from the UF.

A related answer: http://answers.splunk.com/answers/118668/filter-iis-logs-before-indexing.html#answer-119031
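
For reference, here is a minimal sketch of what duplicating the configuration onto the forwarder might look like. The settings are copied unchanged from the indexer stanza in the question; the file location ($SPLUNK_HOME/etc/system/local/props.conf on the forwarder) is an assumption for illustration and is not stated anywhere in this thread. An app's local directory on the forwarder should work as well.

```
# ASSUMED location (not given in the thread):
#   $SPLUNK_HOME/etc/system/local/props.conf on the Universal Forwarder
# The stanza name must match the sourcetype set in the forwarder's inputs.conf.
[csv_br]
INDEXED_EXTRACTIONS = csv
KV_MODE = none
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 20
TIMESTAMP_FIELDS = input_time
TIME_FORMAT = %Y-%m-%d %H:%M:%S
```

After adding the stanza, restart the forwarder so the change takes effect. Files that were already forwarded would likely need to be re-indexed before the extracted fields show up on those events.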

woodcock
Esteemed Legend

If you have changed the setting in either place, you need to restart the Splunk instance on that server like this:

$SPLUNK_HOME/bin/splunk restart

lyndac
Contributor

I stopped Splunk on both the indexer and the forwarder, cleaned the index on the indexer, and then started Splunk on both again. The same problem happens.
I've also run splunk btool check and splunk btool props list --debug, and everything looks OK to me. If it will help, I will post the output of btool, but I am in a closed environment, so that is difficult to do.


woodcock
Esteemed Legend

What directory are you using? If you are using a default directory, go up one directory and down into local to see if there is a competing inputs.conf file that has a different sourcetype.

One more thing: what do you mean by "wrong sourcetype applied"? This could mean either:

1: my events have the wrong sourcetype (e.g. sourcetype!="csv_br")
2: my events lack configurations that should be applied to them based on the sourcetype (e.g. sourcetype="csv_br" but props.conf for "csv_br" is not being applied to my events).


lyndac
Contributor

Do I need to have the props.conf file on the forwarder? I currently only have it on the indexer.


lyndac
Contributor

By "wrong sourcetype applied", I mean #2.
