CSV files indexing with a second structure (new header) with associated values


Hi!

I am currently working on a fairly complex application, and I am indexing many CSV files contained within Zip files.

This data has the following tabular format:


And so on, up to 128 columns.

Everything was working perfectly, with a configuration as:



# your settings

# set by detected source type

# Time zone of HDS data is UTC/GMT

In limits.conf, I had to raise the [kv] limits to allow more than 50 columns to be indexed:

# when non-zero, the point at which kv should stop creating new columns
maxcols  = 512
# maximum number of keys auto kv can generate
limit    = 256
# truncate _raw to this size and then do auto KV
maxchars = 10240

BUT... I recently discovered that the manufacturer's extraction tool (this is big data coming from a storage array) splits a CSV file into 2 parts within the same file (mostly for certain kinds of devices).

At exactly line 1448 of every affected file, a new header is written containing the remaining devices, from 129 to 256 (256 is the technical maximum number of devices per unit).

Splunk can't natively work with that, as mentioned in the docs:

And specifically:

Splunk Enterprise does not support renaming of header fields mid-file

Some software, such as Internet Information Server, supports the renaming of header fields in the middle of the file. Splunk does not recognize changes such as this. If you attempt to index a file which has header fields renamed within the file, Splunk does not index the renamed header field.

Of course, I understand, and the message is clear enough, but I still hope that some advanced technique could make this possible, such as routing some parts of the file to the null queue and not others, or simulating 2 source types for the same file.

Or perhaps some regex trick, I don't know yet...

If anyone has an idea of how this could be managed, I'm sure this would be an interesting case for others 🙂

Thanks in advance for any help and answer!

1 Solution


This cannot be managed natively by Splunk and requires a third-party script to pre-process the data.
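
For anyone landing here, a minimal sketch of what such a pre-processing step could look like, in Python. This is not the script actually used in this case. It assumes that every header line starts with "No." (optionally preceded by a double quote), as in the header quoted elsewhere in this thread. The function name `split_sections`, the comma delimiter, and the sample rows are all made up for illustration.

```python
from typing import Iterable, List


def split_sections(lines: Iterable[str], header_marker: str = "No.") -> List[List[str]]:
    """Group lines into sections, opening a new section at every header line.

    A line is treated as a header when, after stripping whitespace and any
    leading double quote, it starts with header_marker (an assumption).
    """
    sections: List[List[str]] = []
    for line in lines:
        is_header = line.strip().lstrip('"').startswith(header_marker)
        # Open a new section on each header, or if data appears before any header.
        if is_header or not sections:
            sections.append([])
        sections[-1].append(line)
    return sections


# Made-up sample: two headers inside one file, as described in the question.
sample = [
    "No.,time,Device1,Device2\n",
    "1,2013/01/01 00:00,10,20\n",
    "2,2013/01/01 00:01,11,21\n",
    "No.,time,Device129,Device130\n",
    "1,2013/01/01 00:00,30,40\n",
]

parts = split_sections(sample)
for index, part in enumerate(parts, start=1):
    print(f"part {index}: {len(part)} lines, header = {part[0].strip()}")
```

Each section could then be written to its own file before Splunk reads it, so that every indexed file carries a single header.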



You can use a LINE_BREAKER to break the events, like this








Found this answer while looking for something else, and I disagree that this can’t be handled by Splunk. See my answer for more details.

Just note that with large CSV files you may also have to tweak the limits.conf [kv] stanza values to get all the fields to display in search.



My raw data header is as follows:




Just found this post:

It seems a line breaker could split my CSV file, as I have a new header like:

No. time Device1 Device2 ...

Tried adding this in data preview:

LINE_BREAKER = ([\r\n]+)"No."

No success yet...
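
One thing that may be worth checking: the pattern above requires a literal double quote on each side of the header cell, and the unescaped dot matches any character. The quick Python check below tries the same expression against a quoted and an unquoted header. The two sample strings are made up, since the exact raw header is not shown here, and the `relaxed` pattern is only a suggestion. Splunk uses PCRE rather than Python's `re`, but the two behave the same for these simple constructs.

```python
import re

# Pattern copied from the attempt above: needs "No" + any character + a closing quote.
line_breaker = re.compile(r'([\r\n]+)"No."')

# Hypothetical raw data: the second header written with and without quotes.
quoted = '1,10,20\r\n"No.","time","Device129"\r\n'
unquoted = '1,10,20\r\nNo.,time,Device129\r\n'

matches_quoted = line_breaker.search(quoted) is not None
matches_unquoted = line_breaker.search(unquoted) is not None

# A more tolerant variant (an assumption): optional quote, escaped dot.
relaxed = re.compile(r'([\r\n]+)"?No\.')
relaxed_quoted = relaxed.search(quoted) is not None
relaxed_unquoted = relaxed.search(unquoted) is not None

print("original pattern:", matches_quoted, matches_unquoted)
print("relaxed pattern: ", relaxed_quoted, relaxed_unquoted)
```

If the raw header is not quoted, the original expression never matches, which would be one possible reason for the line breaker having no effect.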
