<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Can I skip specific lines while indexing data? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347589#M63809</link>
    <description>&lt;P&gt;Thanks, I'll take a look.  One question: will this let me read the first line as the header and ignore only the second and third lines?&lt;/P&gt;</description>
    <pubDate>Tue, 07 Mar 2017 15:53:18 GMT</pubDate>
    <dc:creator>andrewtrobec</dc:creator>
    <dc:date>2017-03-07T15:53:18Z</dc:date>
    <item>
      <title>Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347587#M63807</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I am trying to index a csv log file that looks like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Description,NumJobWaitEvents,ReturnCode,RunEnd,RunStart,ScheduledStartTime,Status
Job.Description,Job.NumJobWaitEvents,Job.ReturnCode,Job.RunEnd,Job.RunStart,Job.ScheduledStartTime,Job.Status
String,Integer,Integer,DateTime,DateTime,DateTime,enum.JobStatus
Auto Start,0,null,"2017/03/05 06:03:39,441","2017/03/05 06:01:39,269","2017/03/05 06:01:39,065",Completed
Auto Start,0,null,"2017/03/05 06:09:04,493","2017/03/05 06:06:23,915","2017/03/05 06:06:23,743",Completed
AG43_542_TINA_CODE_AGB - Checking,1,null,"2017/03/05 06:32:18,908","2017/03/05 06:23:15,148","2017/03/05 06:23:14,822",Completed
DATA SANITY CHECK,0,null,"2017/03/05 09:02:23,997","2017/03/05 09:00:44,073","2017/03/05 09:00:42,959",Completed
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The first line always contains the header, the second and third lines always contain object and type information, and the log data always starts from the fourth line.&lt;/P&gt;

&lt;P&gt;When I index the file as it is, it only indexes the first two lines even though there are thousands.  My question is: how can I skip the second and third lines so I can index the actual log data?&lt;/P&gt;
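
&lt;P&gt;The layout described above (header on line 1, object and type information on lines 2 and 3, data from line 4) could also be handled outside Splunk by dropping the two metadata lines before the file is indexed. A minimal Python sketch, not a Splunk feature: the helper name &lt;CODE&gt;strip_metadata_lines&lt;/CODE&gt; is an assumption made for illustration, and the sample rows are copied from the log above.&lt;/P&gt;

```python
# Hypothetical helper (not part of Splunk): drop fixed line numbers from a file's lines.
def strip_metadata_lines(lines, skip=(2, 3)):
    """Keep every line whose 1-based line number is not in `skip`."""
    return [line for number, line in enumerate(lines, start=1) if number not in skip]


# First four lines of the sample log: header, object names, types, first data row.
SAMPLE = [
    "Description,NumJobWaitEvents,ReturnCode,RunEnd,RunStart,ScheduledStartTime,Status",
    "Job.Description,Job.NumJobWaitEvents,Job.ReturnCode,Job.RunEnd,Job.RunStart,Job.ScheduledStartTime,Job.Status",
    "String,Integer,Integer,DateTime,DateTime,DateTime,enum.JobStatus",
    'Auto Start,0,null,"2017/03/05 06:03:39,441","2017/03/05 06:01:39,269","2017/03/05 06:01:39,065",Completed',
]

if __name__ == "__main__":
    # Prints the header line followed by the first data row.
    for line in strip_metadata_lines(SAMPLE):
        print(line)
```

&lt;P&gt;Filtering by line number rather than by parsed CSV row leaves the quoted timestamps exactly as they were written.&lt;/P&gt;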

&lt;P&gt;Thank you and best regards,&lt;/P&gt;

&lt;P&gt;Andrew&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2017 15:27:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347587#M63807</guid>
      <dc:creator>andrewtrobec</dc:creator>
      <dc:date>2017-03-07T15:27:52Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347588#M63808</link>
      <description>&lt;P&gt;Check out:   &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.5.2/Admin/Propsconf"&gt;http://docs.splunk.com/Documentation/Splunk/6.5.2/Admin/Propsconf&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;The section:  Structured Data Header Extraction and configuration&lt;/P&gt;

&lt;P&gt;PREAMBLE_REGEX = &amp;lt;regex&amp;gt;&lt;BR /&gt;
* Some files contain preamble lines. This attribute specifies a regular&lt;BR /&gt;
  expression which allows Splunk to ignore these preamble lines, based on&lt;BR /&gt;
  the pattern specified.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2017 15:45:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347588#M63808</guid>
      <dc:creator>dmaislin_splunk</dc:creator>
      <dc:date>2017-03-07T15:45:16Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347589#M63809</link>
      <description>&lt;P&gt;Thanks, I'll take a look.  One question: will this let me read the first line as the header and ignore only the second and third lines?&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2017 15:53:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347589#M63809</guid>
      <dc:creator>andrewtrobec</dc:creator>
      <dc:date>2017-03-07T15:53:18Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347590#M63810</link>
      <description>&lt;P&gt;Yes, exactly.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2017 18:50:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347590#M63810</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-03-07T18:50:22Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347591#M63811</link>
      <description>&lt;P&gt;Thanks @woodcock&lt;/P&gt;

&lt;P&gt;I've been experimenting but can't get it to work.  I've added &lt;CODE&gt;PREAMBLE_REGEX = ^Job\.Description.*|String.*&lt;/CODE&gt; (which works on &lt;A href="https://regex101.com/"&gt;https://regex101.com/&lt;/A&gt;) and &lt;CODE&gt;HEADER_FIELD_LINE_NUMBER = 1&lt;/CODE&gt;, but it has no apparent effect.  I am performing a manual import and selecting &lt;CODE&gt;MY_SOURCETYPE&lt;/CODE&gt;, which is defined in my &lt;CODE&gt;props.conf&lt;/CODE&gt; as follows:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[MY_SOURCETYPE]
AUTO_KV_JSON = 1
DATETIME_CONFIG = 
FIELD_DELIMITER = ,
HEADER_FIELD_LINE_NUMBER = 1
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
PREAMBLE_REGEX = ^Job\.Description.*|String.*
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = RunStart
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled = false
pulldown_type = true
&lt;/CODE&gt;&lt;/PRE&gt;
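
&lt;P&gt;The regex101 check can be repeated locally. A minimal Python sketch, assuming the pattern is applied to one line at a time and that Python's &lt;CODE&gt;re&lt;/CODE&gt; module treats this simple pattern the same way as Splunk's regex engine; both are assumptions, and &lt;CODE&gt;preamble_line_numbers&lt;/CODE&gt; is a hypothetical helper name.&lt;/P&gt;

```python
import re

# Pattern copied from the PREAMBLE_REGEX setting above.
PREAMBLE = re.compile(r"^Job\.Description.*|String.*")

# First four lines of the sample log: header, object names, types, first data row.
SAMPLE_LINES = [
    "Description,NumJobWaitEvents,ReturnCode,RunEnd,RunStart,ScheduledStartTime,Status",
    "Job.Description,Job.NumJobWaitEvents,Job.ReturnCode,Job.RunEnd,Job.RunStart,Job.ScheduledStartTime,Job.Status",
    "String,Integer,Integer,DateTime,DateTime,DateTime,enum.JobStatus",
    'Auto Start,0,null,"2017/03/05 06:03:39,441","2017/03/05 06:01:39,269","2017/03/05 06:01:39,065",Completed',
]


def preamble_line_numbers(lines, pattern=PREAMBLE):
    """Return the 1-based numbers of the lines the pattern matches."""
    return [number for number, line in enumerate(lines, start=1) if pattern.search(line)]


if __name__ == "__main__":
    print(preamble_line_numbers(SAMPLE_LINES))  # prints [2, 3]
```

&lt;P&gt;On these sample lines the pattern selects lines 2 and 3 only. Note that the &lt;CODE&gt;^&lt;/CODE&gt; anchor applies only to the first alternative, so &lt;CODE&gt;String.*&lt;/CODE&gt; would also match any data line that contains the word String.&lt;/P&gt;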

&lt;P&gt;Are there any other configurations that I should be aware of?&lt;/P&gt;

&lt;P&gt;Best regards,&lt;/P&gt;

&lt;P&gt;Andrew&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2017 10:58:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347591#M63811</guid>
      <dc:creator>andrewtrobec</dc:creator>
      <dc:date>2017-03-08T10:58:42Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347592#M63812</link>
      <description>&lt;P&gt;Try this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[MY_SOURCETYPE]
FIELD_DELIMITER = ,
HEADER_FIELD_LINE_NUMBER = 1
INDEXED_EXTRACTIONS = CSV
PREAMBLE_REGEX = (^|[\r\n])(Job\.Description[^\r\n]+|String[^\r\n]+)
TIMESTAMP_FIELDS = RunStart
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
&lt;/CODE&gt;&lt;/PRE&gt;
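
&lt;P&gt;The pattern is written so that it matches whether it is handed a single line or a block of several lines. A minimal Python sketch of that behaviour, assuming Python's &lt;CODE&gt;re&lt;/CODE&gt; module is close enough to Splunk's regex engine for this pattern (an assumption); &lt;CODE&gt;preamble_matches&lt;/CODE&gt; is a hypothetical helper name.&lt;/P&gt;

```python
import re

# Pattern copied from the PREAMBLE_REGEX setting above.
PREAMBLE = re.compile(r"(^|[\r\n])(Job\.Description[^\r\n]+|String[^\r\n]+)")

# First four lines of the sample log from the question.
HEADER = "Description,NumJobWaitEvents,ReturnCode,RunEnd,RunStart,ScheduledStartTime,Status"
OBJECTS = "Job.Description,Job.NumJobWaitEvents,Job.ReturnCode,Job.RunEnd,Job.RunStart,Job.ScheduledStartTime,Job.Status"
TYPES = "String,Integer,Integer,DateTime,DateTime,DateTime,enum.JobStatus"
DATA = 'Auto Start,0,null,"2017/03/05 06:03:39,441","2017/03/05 06:01:39,269","2017/03/05 06:01:39,065",Completed'


def preamble_matches(text, pattern=PREAMBLE):
    """Return the preamble lines found in `text` (a single line or several lines)."""
    return [match.group(2) for match in pattern.finditer(text)]


if __name__ == "__main__":
    # Case 1: the pattern sees one line at a time.
    print(preamble_matches(OBJECTS))
    # Case 2: the pattern sees the whole block, lines joined by newlines.
    print(preamble_matches("\n".join([HEADER, OBJECTS, TYPES, DATA])))
```

&lt;P&gt;In the first case the &lt;CODE&gt;^&lt;/CODE&gt; branch anchors the match at the start of the line; in the second the &lt;CODE&gt;[\r\n]&lt;/CODE&gt; branch picks up each preamble line after a line break, so the header and data rows are left alone in both cases.&lt;/P&gt;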

&lt;P&gt;Also, IMHO, events that are "durationful" (i.e. contain &lt;CODE&gt;start&lt;/CODE&gt; and &lt;CODE&gt;end&lt;/CODE&gt; time details) should &lt;EM&gt;always&lt;/EM&gt; use the &lt;CODE&gt;end&lt;/CODE&gt; time as the &lt;CODE&gt;timestamp&lt;/CODE&gt;.  To give just one reason, think about what your timechart would look like if your system crashed and all events ended at the same time.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2017 18:14:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347592#M63812</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-03-08T18:14:59Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347593#M63813</link>
      <description>&lt;P&gt;Thanks for the suggestion, but unfortunately it doesn't work.  I think I see what you're getting at, though: you're trying to create one expression that covers both lines, right?  I'm not too proficient with regexes.&lt;/P&gt;

&lt;P&gt;I'll keep playing with it, thanks!&lt;/P&gt;

&lt;P&gt;Andrew&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2017 18:53:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347593#M63813</guid>
      <dc:creator>andrewtrobec</dc:creator>
      <dc:date>2017-03-08T18:53:29Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347594#M63814</link>
      <description>&lt;P&gt;Yes, and make it flexible enough to work whether it is presented with the entire event or just a single line.  That really should have done it.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2017 22:11:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347594#M63814</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-03-08T22:11:53Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347595#M63815</link>
      <description>&lt;P&gt;To make sure I'm following the right procedure, here are the steps I've followed:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;&lt;P&gt;Edit &lt;CODE&gt;props.conf&lt;/CODE&gt; located in SPLUNKHOME\etc\apps\MY_APP\local to contain&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[MY_SOURCETYPE]
FIELD_DELIMITER = ,
HEADER_FIELD_LINE_NUMBER = 1
INDEXED_EXTRACTIONS = csv
PREAMBLE_REGEX = (^|[\r\n])(Job.Description[^\r\n]+|String[^\r\n]+)
TIMESTAMP_FIELDS = RunStart
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
&lt;/CODE&gt;&lt;/PRE&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;Restart splunkd from the command prompt: &lt;CODE&gt;net stop splunkd&lt;/CODE&gt;, then &lt;CODE&gt;net start splunkd&lt;/CODE&gt;&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;Once up, log into Splunk (6.5.2 btw) and enter &lt;CODE&gt;my app&lt;/CODE&gt;&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;From the &lt;CODE&gt;Settings&lt;/CODE&gt; menu, select &lt;CODE&gt;Add Data&lt;/CODE&gt;&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;Select &lt;CODE&gt;upload&lt;/CODE&gt;&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;Select the csv that contains the data above&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;Select &lt;CODE&gt;Next&lt;/CODE&gt;&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;From the Source type list, select MY_SOURCETYPE&lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;At this point, the first two lines of the event list are as follows&lt;/P&gt;&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;&lt;IMG src="http://i.imgur.com/uApQQ07.png" alt="alt text" /&gt;&lt;/P&gt;

&lt;P&gt;If the regex works as planned, would I see those two lines at that point?&lt;/P&gt;

&lt;P&gt;Best regards,&lt;/P&gt;

&lt;P&gt;Andrew&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 13:09:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347595#M63815</guid>
      <dc:creator>andrewtrobec</dc:creator>
      <dc:date>2020-09-29T13:09:15Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347596#M63816</link>
      <description>&lt;P&gt;If everything is working, you should not see those lines.  HOWEVER, I have never used the &lt;CODE&gt;Add Data&lt;/CODE&gt; wizard with &lt;CODE&gt;INDEXED_EXTRACTIONS&lt;/CODE&gt; before.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Mar 2017 16:23:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347596#M63816</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-03-09T16:23:17Z</dc:date>
    </item>
    <item>
      <title>Re: Can I skip specific lines while indexing data?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347597#M63817</link>
      <description>&lt;P&gt;Hi Andrew,&lt;/P&gt;

&lt;P&gt;Many greetings. We were colleagues and shared a lot of Splunk information. I have a very similar problem to the one you described. Did you solve it in the meantime?&lt;BR /&gt;
I wish you all the best.&lt;BR /&gt;
Michal Spisiak&lt;/P&gt;</description>
      <pubDate>Tue, 10 Sep 2019 06:28:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Can-I-skip-specific-lines-while-indexing-data/m-p/347597#M63817</guid>
      <dc:creator>spisiakmi</dc:creator>
      <dc:date>2019-09-10T06:28:37Z</dc:date>
    </item>
  </channel>
</rss>

