Solved: CSV and TSV File Inputs on Universal Forwarder - Do I need to configure both INDEXED_EXTRACTIONS and FIELD_DELIMITER?

gn694 · ‎04-05-2017

I am going to be forwarding CSV and TSV files, and was wondering if I need to configure both INDEXED_EXTRACTIONS and FIELD_DELIMITER in props.conf for the sourcetype on the Universal Forwarder.

It seems redundant to tell it
INDEXED_EXTRACTIONS = csv and FIELD_DELIMITER = ,
and
INDEXED_EXTRACTIONS = tsv and FIELD_DELIMITER = \t

If it is a csv it should be obvious the field delimiter is a comma.
And if it is a tsv it should be obvious the field delimiter is a tab.

Is there a reason to configure both? Or, if only one is needed, is there a reason to prefer one over the other?

woodcock · ‎04-05-2017

When I first got started with INDEXED_EXTRACTIONS, I was confused, too, and thought as @richgalloway did. It turns out, though, that this setting is highly unusual: it causes a Universal Forwarder to violate the "UFs do not index fields" rule. So you MUST deploy INDEXED_EXTRACTIONS to the UF and NOT to the Indexers. You need not set FIELD_DELIMITER, but as long as you set it to match INDEXED_EXTRACTIONS, doing so is harmless. So why does FIELD_DELIMITER exist? Because it overrides the "C" in CSV when you have a file delimited by some other character, say a "%SV" file. %SV is not an option for INDEXED_EXTRACTIONS, so you use csv and then override the delimiter by setting FIELD_DELIMITER=%. It makes perfect sense.
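
To make the answer above concrete, here is a minimal props.conf sketch for the Universal Forwarder. The sourcetype names (example_csv, example_tsv, example_pct) are hypothetical placeholders, not names from this thread; the setting values follow what is described above.

```ini
# props.conf, deployed to the Universal Forwarder (not the indexers).
# Stanza names are hypothetical; use your own sourcetype names.

# Comma-separated files: the delimiter is implied by "csv".
[example_csv]
INDEXED_EXTRACTIONS = csv

# Tab-separated files: the delimiter is implied by "tsv".
[example_tsv]
INDEXED_EXTRACTIONS = tsv

# "%SV" files: there is no such INDEXED_EXTRACTIONS value,
# so start from csv and override the delimiter.
[example_pct]
INDEXED_EXTRACTIONS = csv
FIELD_DELIMITER = %
```

Comments are kept on their own lines here; FIELD_DELIMITER appears only in the stanza where the delimiter differs from what the INDEXED_EXTRACTIONS value already implies.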

gn694 · ‎04-05-2017

Thanks. This is certainly a confusing topic. Before last month I would have said that this type of thing had to be done on the indexer, and we have done it that way many times in the past. I just happened to come across the Forwarder configurations. I'm not sure when this functionality was added to the forwarder, but it does help offload some of the work from me and the Splunk infrastructure team, and it allows server/service admins to add these types of inputs on their own. Thank you @woodcock and @richgalloway for taking the time to answer and help me with this.

woodcock · ‎04-05-2017

Yes, it is a sneaky way to offload some indexing workload from the Indexers to the Forwarders.

richgalloway · ‎04-05-2017

Thanks for straightening me out, woodcock. So what does a UF do with INDEXED_EXTRACTIONS?

---
If this reply helps you, Karma would be appreciated.

woodcock · ‎04-05-2017

It creates all the indexed fields just like a HF (heavy forwarder) would and sends them to the indexers. A year or two ago @martin_mueller or maybe @mus straightened me out, too.

richgalloway · ‎04-05-2017

Neither of those settings applies to a Universal Forwarder because the UF does not parse the files.
Set INDEXED_EXTRACTIONS on your indexers. There is no need to also set FIELD_DELIMITER.

gn694 · ‎04-05-2017

I think that may have been the case in previous Forwarder versions, but you can now configure this on the Forwarder. "If you have Splunk Enterprise, you can edit the settings on indexer machines or machines where you are running the Splunk universal forwarder." http://docs.splunk.com/Documentation/Splunk/latest/Data/Extractfieldsfromfileswithstructureddata

I already have the CSV files being forwarded to Splunk, with props.conf configured on the Forwarder. I tested and verified that last month. It was only today, when I went to add the TSV files, that I asked myself why I was using both INDEXED_EXTRACTIONS and FIELD_DELIMITER for the CSV files and whether I actually needed to.
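
For readers reproducing this kind of setup, a monitor input on the forwarder might look roughly like the sketch below. The file paths, index, and sourcetype names are assumptions made for illustration and do not come from this thread; each sourcetype would need a matching stanza in the forwarder's props.conf that sets INDEXED_EXTRACTIONS.

```ini
# inputs.conf on the Universal Forwarder.
# Paths, index, and sourcetype names are hypothetical.

[monitor:///var/log/example/*.csv]
sourcetype = example_csv
index = main

[monitor:///var/log/example/*.tsv]
sourcetype = example_tsv
index = main
```

Keeping the CSV and TSV files under separate sourcetypes lets each one carry its own INDEXED_EXTRACTIONS value.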

woodcock · ‎07-19-2017

To tie this completely off: INDEXED_EXTRACTIONS happens on the forwarder, NOT the Indexer.

