Getting Data In

Indexing CSV files that contain a second header mid-file with its own values


Hi !

I am currently working on a fairly complex application, and I am indexing many CSV files contained within ZIP files.

This data has the following tabular format:


And so on, up to 128 columns.

Everything was working perfectly with the following configuration:



# your settings

# set by detected source type

# Time zone of HDS data is UTC/GMT

In limits.conf, I had to raise the kv limits to allow more than 50 columns to be indexed:

[kv]
# when non-zero, the point at which kv should stop creating new columns
maxcols  = 512
# maximum number of keys auto kv can generate
limit    = 256
# truncate _raw to this size and then do auto KV
maxchars = 10240

However, I recently discovered that the manufacturer's extraction tool (this is big data coming from storage arrays) splits some CSV files, mostly for certain kinds of devices, into 2 parts within the same file.

At exactly line 1448 of every affected file, a new header is written containing the remaining devices, numbered 129 to 256 (256 is the technical maximum number of devices per unit).

Splunk can't natively handle this, as mentioned in the docs:

And specifically:

Splunk Enterprise does not support renaming of header fields mid-file

Some software, such as Internet Information Server, supports the renaming of header fields in the middle of the file. Splunk does not recognize changes such as this. If you attempt to index a file which has header fields renamed within the file, Splunk does not index the renamed header field.

Of course, I understand, and the message is clear enough, but I still hope that some advanced technique could make this possible, such as routing one part of the file to the null queue and keeping the other, or simulating two source types for the same file.

Or perhaps some regex trick, I don't know yet...

If anyone has an idea of how this could be managed, I'm sure it would be an interesting case for others 🙂

Thanks in advance for any help and answer!

0 Karma
1 Solution


Cannot be natively managed by Splunk, and requires a third party script to pre-process the data


0 Karma


You can use a LINE_BREAKER to break the events, like this




0 Karma




Found this answer while looking for something else, and I disagree that this can't be handled by Splunk. See my answer for more details.

Just note that with large CSV files you may also have to tweak the limits.conf [kv] stanza values to get all the fields to display in search.

0 Karma


My raw data header is as follows:


0 Karma


Just found this post:

It seems a line breaker could split my CSV file, since the new header looks like:

No. time Device1 Device2 ...

Tried adding this in data preview:

LINE_BREAKER = ([\r\n]+)"No."

No success yet...
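
One way to sanity-check the pattern outside Splunk is to run it against a few sample lines. The sketch below uses Python's `re` module, which is close to but not identical to the regex engine Splunk uses, so treat it only as a rough check. The two sample strings are made up for illustration and are not taken from the real data.

```python
import re

# The pattern from the LINE_BREAKER setting above, copied verbatim.
pattern = re.compile(r'([\r\n]+)"No."')

# Hypothetical samples: the same header with and without quoted fields.
quoted = '1,10,20\r\n"No.","time","Device129"\r\n1,30,40\r\n'
unquoted = '1,10,20\r\nNo.,time,Device129\r\n1,30,40\r\n'


def break_offsets(text):
    """Return the start offset of each line-break group the pattern matches."""
    return [m.start(1) for m in pattern.finditer(text)]


print(break_offsets(quoted))    # one match, right before the quoted header
print(break_offsets(unquoted))  # no match: the pattern requires the quotes
```

So the pattern only fires when the raw file really has `"No."` in double quotes at the start of the header line. Note also that the `.` inside the pattern is unescaped, so it matches any character rather than a literal dot.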

0 Karma