I am going to be forwarding CSV and TSV files, and was wondering if I need to configure both INDEXED_EXTRACTIONS and FIELD_DELIMITER in props.conf for the sourcetype on the Universal Forwarder.
It seems redundant to tell it
INDEXED_EXTRACTIONS= csv and FIELD_DELIMITER= ,
and
INDEXED_EXTRACTIONS= tsv and FIELD_DELIMITER= \t
If it is a csv it should be obvious the field delimiter is a comma.
And if it is a tsv it should be obvious the field delimiter is a tab.
Is there a reason to configure both? Or if only one is needed is there a reason to use one over the other?
When I first got started with INDEXED_EXTRACTIONS, I was confused, too, and thought as @richgalloway did. It turns out, though, that this setting is highly unusual: it causes a Universal Forwarder to violate the "UFs do not index fields" rule. So you MUST deploy INDEXED_EXTRACTIONS to the UF and NOT to the Indexers. You need not set FIELD_DELIMITER but, so long as you set it to match INDEXED_EXTRACTIONS, doing so is harmless. Now, why does FIELD_DELIMITER exist? Because it overrides the "C" in CSV when you have a file like "%SV", that is, one whose values are separated by % rather than by commas. You will note that %SV is not an option for INDEXED_EXTRACTIONS, so you use CSV and then override the delimiter by setting FIELD_DELIMITER=%. It makes perfect sense.
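As a rough sketch of what that could look like in props.conf on the Universal Forwarder: the three stanza (sourcetype) names below are made up for illustration, and only the INDEXED_EXTRACTIONS and FIELD_DELIMITER settings come from this thread.

```
# props.conf on the Universal Forwarder
# Stanza names are hypothetical; substitute your own sourcetypes.

# Comma-separated: the delimiter is implied by "csv"
[my_csv_sourcetype]
INDEXED_EXTRACTIONS = csv

# Tab-separated: the delimiter is implied by "tsv"
[my_tsv_sourcetype]
INDEXED_EXTRACTIONS = tsv

# Percent-separated: there is no "%sv" option, so start from csv
# and override only the delimiter
[my_percent_delimited_sourcetype]
INDEXED_EXTRACTIONS = csv
FIELD_DELIMITER = %
```

The first two stanzas leave FIELD_DELIMITER out because the extraction type already implies it. Setting it to the matching character should be harmless, just redundant.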
Thanks. This is certainly a confusing topic. Before last month I would have said that this type of thing had to be done on the indexer, and we have done it that way many times in the past. I just happened to come across the Forwarder configurations. I'm not sure when this functionality was added to the forwarder, but it does help offload some of the work from me and the Splunk infrastructure team, and it lets server/service admins add these types of inputs on their own. Thank you @woodcock and @richgalloway for taking the time to answer and help me with this.
Yes, it is a sneaky way to offload some indexing workload from the Indexers to the Forwarders.
Thanks for straightening me out, woodcock. So what does a UF do with INDEXED_EXTRACTIONS?
It creates all the indexed fields just like a HF would and sends them to the indexers. A year or two ago @martin_mueller or maybe @mus straightened me out, too.
Neither of those settings applies to a Universal Forwarder because the UF does not parse the files.
Set INDEXED_EXTRACTIONS on your indexers. There is no need to also set FIELD_DELIMITER.
I think that may have been the case in previous Forwarder versions, but you can now configure this on the Forwarder. "If you have Splunk Enterprise, you can edit the settings on indexer machines or machines where you are running the Splunk universal forwarder." http://docs.splunk.com/Documentation/Splunk/latest/Data/Extractfieldsfromfileswithstructureddata
I already have the CSV files being forwarded to Splunk, with props.conf configured on the Forwarder. I tested and verified that last month. It was only today, when I went to add the TSV files, that I asked myself why I was using both INDEXED_EXTRACTIONS and FIELD_DELIMITER for the CSV files and whether I actually needed to.
To tie this completely off: INDEXED_EXTRACTIONS happens on the forwarder, NOT the Indexer.