<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: CSV files indexing with a second structure (new header) with associated values in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199528#M39490</link>
    <description>&lt;P&gt;Just found this post:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://answers.splunk.com/answers/107021/indexing-data-with-multiple-headers"&gt;http://answers.splunk.com/answers/107021/indexing-data-with-multiple-headers&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;It seems a line breaker could split my CSV file, as I have a new header like:&lt;/P&gt;

&lt;P&gt;No. time    Device1 Device2 ...&lt;/P&gt;

&lt;P&gt;Tried adding this in data preview:&lt;/P&gt;

&lt;P&gt;LINE_BREAKER = ([\r\n]+)"No."&lt;/P&gt;

&lt;P&gt;No success yet...&lt;/P&gt;</description>
    <pubDate>Tue, 17 Jun 2014 15:46:26 GMT</pubDate>
    <dc:creator>guilmxm</dc:creator>
    <dc:date>2014-06-17T15:46:26Z</dc:date>
    <item>
      <title>CSV files indexing with a second structure (new header) with associated values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199527#M39489</link>
      <description>&lt;P&gt;Hi!&lt;/P&gt;

&lt;P&gt;I am currently working on a fairly complex application, indexing many CSV files contained within ZIP files.&lt;/P&gt;

&lt;P&gt;This data has the following tabular format:&lt;/P&gt;

&lt;P&gt;timestamp,device1,device2,device3...&lt;BR /&gt;
timestamp,value1,value2,value3...&lt;/P&gt;

&lt;P&gt;And so on, up to 128 columns.&lt;/P&gt;

&lt;P&gt;Everything was working perfectly with the following configuration:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;props.conf&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[hds_perf]
# your settings
INDEXED_EXTRACTIONS=csv
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=false

# set by detected source type
KV_MODE=none
pulldown_type=true

# Time zone of HDS data is UTC/GMT
TZ=UTC
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In limits.conf, I had to raise the [kv] limits to allow more than 50 columns to be indexed:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[kv]
# when non-zero, the point at which kv should stop creating new columns
maxcols  = 512
# maximum number of keys auto kv can generate
limit    = 256
# truncate _raw to this size and then do auto KV
maxchars = 10240
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;BUT... I recently discovered that the manufacturer's extraction tool (this is big data coming from a storage array) splits a CSV file (mostly for certain devices) into 2 parts within the same file.&lt;/P&gt;

&lt;P&gt;At exactly line 1448 of every affected file, a new header is written containing the remaining devices, 129 through 256 (256 is the technical maximum number of devices per unit).&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Splunk can't natively work with that, as mentioned in the docs:&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;&lt;A href="http://docs.splunk.com/Documentation/Splunk/6.1.1/Data/Extractfieldsfromfileheadersatindextime"&gt;http://docs.splunk.com/Documentation/Splunk/6.1.1/Data/Extractfieldsfromfileheadersatindextime&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;And especially:&lt;/STRONG&gt;&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;Splunk Enterprise does not support renaming of header fields mid-file.&lt;BR /&gt;
Some software, such as Internet Information Server, supports the renaming of header fields in the middle of the file. Splunk does not recognize changes such as this. If you attempt to index a file which has header fields renamed within the file, Splunk does not index the renamed header field.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;Of course, I understand, and the message is clear enough, but I still hope that some advanced technique could make this possible, such as routing part of the file to the null queue while keeping the rest, or simulating two source types for the same file.&lt;/P&gt;

&lt;P&gt;Or perhaps some regex stuff, I don't know yet...&lt;/P&gt;
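
&lt;P&gt;Another route would be to pre-process each file before Splunk reads it, cutting it wherever a header row reappears. The following is only a minimal Python sketch of that idea, not a tested solution: the function name split_on_repeated_header, the header marker "No.", and the sample rows are all hypothetical and would need adjusting to the real export.&lt;/P&gt;

```python
import io
from typing import Iterable, Iterator, List


def split_on_repeated_header(lines: Iterable[str], header_prefix: str) -> Iterator[List[str]]:
    """Yield one list of lines per CSV section.

    A new section starts at every line that begins with header_prefix
    (a hypothetical marker for a header row; adjust to the real data).
    """
    section: List[str] = []
    for line in lines:
        # A header line closes the section collected so far, if any.
        if line.startswith(header_prefix) and section:
            yield section
            section = []
        section.append(line)
    # Emit whatever remains after the last header.
    if section:
        yield section


# Tiny made-up sample standing in for one exported file.
sample = io.StringIO(
    '"No.","time","dev1","dev2"\n'
    '"1","2014-06-17 15:00","10","20"\n'
    '"No.","time","dev129","dev130"\n'
    '"1","2014-06-17 15:00","30","40"\n'
)

sections = list(split_on_repeated_header(sample, '"No.",'))
for index, part in enumerate(sections, start=1):
    print(f"section {index}: {len(part)} lines, header = {part[0].strip()}")
```

&lt;P&gt;Each section could then be written to its own file, so that every file Splunk monitors carries exactly one header.&lt;/P&gt;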

&lt;P&gt;If anyone has an idea of how this could be managed, I'm sure it would be an interesting case for others &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;Thanks in advance for any help and answer!&lt;/P&gt;</description>
      <pubDate>Tue, 17 Jun 2014 15:09:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199527#M39489</guid>
      <dc:creator>guilmxm</dc:creator>
      <dc:date>2014-06-17T15:09:16Z</dc:date>
    </item>
    <item>
      <title>Re: CSV files indexing with a second structure (new header) with associated values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199528#M39490</link>
      <description>&lt;P&gt;Just found this post:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://answers.splunk.com/answers/107021/indexing-data-with-multiple-headers"&gt;http://answers.splunk.com/answers/107021/indexing-data-with-multiple-headers&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;It seems a line breaker could split my CSV file, as I have a new header like:&lt;/P&gt;

&lt;P&gt;No. time    Device1 Device2 ...&lt;/P&gt;

&lt;P&gt;Tried adding this in data preview:&lt;/P&gt;

&lt;P&gt;LINE_BREAKER = ([\r\n]+)"No."&lt;/P&gt;

&lt;P&gt;No success yet...&lt;/P&gt;</description>
      <pubDate>Tue, 17 Jun 2014 15:46:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199528#M39490</guid>
      <dc:creator>guilmxm</dc:creator>
      <dc:date>2014-06-17T15:46:26Z</dc:date>
    </item>
    <item>
      <title>Re: CSV files indexing with a second structure (new header) with associated values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199529#M39491</link>
      <description>&lt;P&gt;My raw data header is as follows:&lt;/P&gt;

&lt;P&gt;"No.","time",...&lt;/P&gt;</description>
      <pubDate>Tue, 17 Jun 2014 15:59:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199529#M39491</guid>
      <dc:creator>guilmxm</dc:creator>
      <dc:date>2014-06-17T15:59:14Z</dc:date>
    </item>
    <item>
      <title>Re: CSV files indexing with a second structure (new header) with associated values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199530#M39492</link>
      <description>&lt;P&gt;This cannot be handled natively by Splunk; it requires a third-party script to pre-process the data.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Feb 2015 20:54:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199530#M39492</guid>
      <dc:creator>guilmxm</dc:creator>
      <dc:date>2015-02-25T20:54:54Z</dc:date>
    </item>
    <item>
      <title>Re: CSV files indexing with a second structure (new header) with associated values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199531#M39493</link>
      <description>&lt;P&gt;You can use a LINE_BREAKER to break the events, like this:&lt;/P&gt;

&lt;P&gt;props.conf&lt;BR /&gt;
[sourcetypeName]&lt;BR /&gt;
LINE_BREAKER=([\n\r]+)regexThatMatches2ndHeaderHere&lt;BR /&gt;
TRANSFORMS-aaa=transform1,transform2&lt;/P&gt;

&lt;P&gt;transforms.conf&lt;BR /&gt;
[transform1]&lt;BR /&gt;
REGEX=regexToExtract128FieldsinData1&lt;/P&gt;

&lt;P&gt;[transform2]&lt;BR /&gt;
REGEX=regexToExtractFieldsInData2&lt;/P&gt;</description>
      <pubDate>Fri, 27 Oct 2017 01:16:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199531#M39493</guid>
      <dc:creator>jkat54</dc:creator>
      <dc:date>2017-10-27T01:16:49Z</dc:date>
    </item>
    <item>
      <title>Re: CSV files indexing with a second structure (new header) with associated values</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199532#M39494</link>
      <description>&lt;P&gt;Found this answer while looking for something else, and I disagree that this can’t be handled by Splunk. See my answer for more details.&lt;/P&gt;

&lt;P&gt;Just note that with large CSV files you may also have to tweak the limits.conf [kv] stanza values to get all the fields to display in search.&lt;/P&gt;</description>
      <pubDate>Mon, 30 Oct 2017 11:37:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/CSV-files-indexing-with-a-second-structure-new-header-with/m-p/199532#M39494</guid>
      <dc:creator>jkat54</dc:creator>
      <dc:date>2017-10-30T11:37:58Z</dc:date>
    </item>
  </channel>
</rss>

