
Splunk Analytics for Hadoop: any workarounds for FIELD_HEADER_REGEX

SplunkTrust

I am using Splunk Analytics for Hadoop/Hunk to search small CSV files in HDFS. The first line of each CSV is a header that gives the order of the columns.

columnheader1,columnheader2,columnheader3
data,blah,foo
moredata,bleah,foo2

When I tried using FIELD_HEADER_REGEX to exclude the header, I found that, indeed, FIELD_HEADER_REGEX and other similar props.conf directives do not work in virtual indexes.

https://docs.splunk.com/Documentation/Splunk/7.0.0/HadoopAnalytics/Headerextractions

I can search with

index=myvirtualindex NOT "columnheader1,columnheader2,columnheader3"

Are there any other workarounds?


Splunk Employee

By default, the Splunk record reader com.splunk.mr.input.SimpleCSVRecordReader should be able to process CSV files.
This record reader displays the CSV file as JSON in the UI.
I assume that in your case the record reader has an issue with the first line.

SplunkTrust

Hey Raanan! Oh, I see: in the stanza for the virtual index I had both of these lines:

vix.input.1.recordreader             = com.splunk.mr.input.SimpleCSVRecordReader
vix.input.1.recordreader             = SplunkLineRecordReader

So I removed the second one (the SplunkLineRecordReader line) and everything now works perfectly. The header no longer shows up in the results. Thanks!
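
For anyone hitting the same thing, here is a rough sketch of what the cleaned-up virtual index stanza in indexes.conf might look like. Only the stanza name and the vix.input.1.recordreader line come from this thread; the provider name and the HDFS path are placeholders, so substitute your own. With both recordreader lines present, the later SplunkLineRecordReader entry presumably took precedence, which would explain why the header came through as an ordinary event.

```
# indexes.conf (sketch; provider name and path below are hypothetical)
[myvirtualindex]
vix.provider = my-hadoop-provider
vix.input.1.path = /data/csv/...
# Keep exactly one recordreader entry for this input, so the CSV reader
# is the one that parses the files.
vix.input.1.recordreader = com.splunk.mr.input.SimpleCSVRecordReader
```

If the header still appears after a change like this, it is worth checking the stanza for any other duplicated vix.input.1.* keys.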
