<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to drop log file headers before indexing? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364409#M66366</link>
    <description>&lt;P&gt;The answer was surprisingly simple: my regex was an exact match for the entire line, which stopped the trick from working. What was required was:&lt;BR /&gt;
FIELD_HEADER_REGEX=\s+16\s+Question\s+N&lt;/P&gt;

&lt;P&gt;This way the regex matches part of the line rather than the entire line, and the entire section up to that point appears to have been dropped.&lt;/P&gt;

&lt;P&gt;EDIT: I've found a few limitations to this. The forwarder in use started logging time parsing and line breaking warnings after this setting was applied, so I assume it's attempting to parse the logs before forwarding them to the indexer.&lt;BR /&gt;
Furthermore, the forwarder's CPU usage increased in order to process this setting, so I've fallen back to performing the work on the indexer rather than on the universal forwarder, and I no longer use this trick.&lt;/P&gt;</description>
    <pubDate>Tue, 29 Sep 2020 14:44:50 GMT</pubDate>
    <dc:creator>gjanders</dc:creator>
    <dc:date>2020-09-29T14:44:50Z</dc:date>
    <item>
      <title>How to drop log file headers before indexing?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364405#M66362</link>
      <description>&lt;P&gt;I've had a read of &lt;A href="https://www.splunk.com/blog/2013/10/22/dropping-useless-headers-in-splunk-6.html" target="_blank"&gt;dropping useless headers in Splunk 6&lt;/A&gt; and tried using FIELD_HEADER_REGEX. I also tried the HEADER_FIELD_LINE_NUMBER trick, but neither worked as expected.&lt;/P&gt;

&lt;P&gt;The blog post says:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;I stole some of this from the&lt;BR /&gt;
Websphere App but added the&lt;BR /&gt;
FIELD_HEADER_REGEX. This tells Splunk&lt;BR /&gt;
to look for that last line of the&lt;BR /&gt;
header from above:&lt;BR /&gt;
************* End Display Current Environment *************&lt;BR /&gt;
And start indexing events after that.&lt;BR /&gt;
You could also use&lt;BR /&gt;
HEADER_FIELD_LINE_NUMBER if your data&lt;BR /&gt;
writes a consistent number of header&lt;BR /&gt;
lines.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;The default props.conf is:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[MSAD:NT6:DNS]
CHECK_FOR_HEADER = 0
REPORT_KV_for_microsoft_dns_web = KV_for_port,KV_for_Domain,KV_for_RecvdIP,KV_for_microsoftdns_action,KV_for_Record_type,KV_for_Record_Class
SHOULD_LINEMERGE = false
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For the props.conf on the universal forwarder I added:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[MSAD:NT6:DNS]
#Drop the header lines from the file
FIELD_HEADER_REGEX=\s+16\s+Question\s+Name
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;That does something: it combines the header lines into a single event, which looks like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Log file wrap at 27/06/2017 2:35:27 PM
Message logging key (for packets - other items use a subset of these fields):
    Field #  Information         Values
    -------  -----------         ------
       1     Date
       2     Time
       3     Thread ID
       4     Context
       5     Internal packet identifier
       6     UDP/TCP indicator
       7     Send/Receive indicator
       8     Remote IP
       9     Xid (hex)
      10     Query/Response      R = Response
                                 blank = Query
      11     Opcode              Q = Standard Query
                                 N = Notify
                                 U = Update
                                 ? = Unknown
      12     [ Flags (hex)
      13     Flags (char codes)  A = Authoritative Answer
                                 T = Truncated Response
                                 D = Recursion Desired
                                 R = Recursion Available
      14     ResponseCode ]
      15     Question Type
      16     Question Name
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This is an improvement over having 16 separate events that relate to the header, but the header has still not been dropped.&lt;/P&gt;

&lt;P&gt;On the indexing tier I tried:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[MSAD:NT6:DNS]
TRANSFORMS-t1 = eliminate-DNSHeaders
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;And:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[eliminate-DNSHeaders]
REGEX=(?m)^Log file wrap at
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I also tried without the (?m). Clearly I'm missing something, but I am not sure how I should drop the header records, which are not useful.&lt;/P&gt;

&lt;P&gt;Can someone let me know how to drop these records? Note that the last setting was placed on the indexers, not the universal forwarders.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 14:35:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364405#M66362</guid>
      <dc:creator>gjanders</dc:creator>
      <dc:date>2020-09-29T14:35:38Z</dc:date>
    </item>
    <item>
      <title>Re: How to drop log file headers before indexing?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364406#M66363</link>
      <description>&lt;P&gt;You want to filter out everything up to&lt;BR /&gt;
&lt;CODE&gt;16     Question Name&lt;/CODE&gt;&lt;BR /&gt;
right? (This contradicts &lt;CODE&gt;REGEX=(?m)^Log file wrap at&lt;/CODE&gt;, right?)&lt;/P&gt;

&lt;P&gt;maybe, did you try - &lt;BR /&gt;
&lt;CODE&gt;FIELD_HEADER_REGEX=16\s+Question\s+Name&lt;/CODE&gt;&lt;BR /&gt;
or simply&lt;BR /&gt;
&lt;CODE&gt;FIELD_HEADER_REGEX=Question\s+Name&lt;/CODE&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 27 Jun 2017 05:39:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364406#M66363</guid>
      <dc:creator>inventsekar</dc:creator>
      <dc:date>2017-06-27T05:39:48Z</dc:date>
    </item>
    <item>
      <title>Re: How to drop log file headers before indexing?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364407#M66364</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;you want to filter out until&lt;BR /&gt;
16 Question Name right? (this contradicts with - REGEX=(?m)^Log file&lt;BR /&gt;
wrap at right)&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;Yes I want to filter out the headers, the last header is "16 Question Name".&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;maybe, did you try -&lt;BR /&gt;
FIELD_HEADER_REGEX=16\s+Question\s+Name&lt;BR /&gt;
or simply&lt;BR /&gt;
FIELD_HEADER_REGEX=Question\s+Name&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;Before I had FIELD_HEADER_REGEX in place, each header line, such as:&lt;BR /&gt;
Log file wrap at 27/06/2017 2:35:27 PM&lt;BR /&gt;
Message logging key (for packets - other items use a subset of these fields):&lt;/P&gt;

&lt;P&gt;came in as an individual event. Now I have a single 16-line event, so I'm fairly confident FIELD_HEADER_REGEX is doing something. Note that FIELD_HEADER_REGEX is set on the UF.&lt;/P&gt;

&lt;P&gt;Also, FYI, I tested the transforms.conf with:&lt;BR /&gt;
REGEX = Log file wrap at&lt;/P&gt;

&lt;P&gt;It is still not working. I will now try adding FIELD_HEADER_REGEX at the indexer level in case it was supposed to be there (the documentation implies that it belongs wherever the input is defined, but it's worth a try).&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 14:35:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364407#M66364</guid>
      <dc:creator>gjanders</dc:creator>
      <dc:date>2020-09-29T14:35:41Z</dc:date>
    </item>
    <item>
      <title>Re: How to drop log file headers before indexing?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364408#M66365</link>
      <description>&lt;P&gt;I couldn't get it to drop the headers as expected, so I've resorted to this for now:&lt;/P&gt;

&lt;P&gt;props.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[MSAD:NT6:DNS]
TRANSFORMS-t1 = eliminate-dnsheaders
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[eliminate-dnsheaders]
REGEX = ^[^\d]
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;
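
&lt;P&gt;To sanity-check which events this transform would route to the nullQueue, the same regex can be tried outside Splunk. The following is only a rough Python sketch, not Splunk's own regex engine. The helper name would_be_dropped and the data line are hypothetical (an assumed shape for a DNS debug log event); the two header lines are taken from the question.&lt;/P&gt;

```python
import re

# Same pattern as REGEX in transforms.conf: the event starts with a non-digit.
HEADER_LIKE = re.compile(r"^[^\d]")


def would_be_dropped(event):
    """Return True if the event matches the nullQueue regex."""
    return HEADER_LIKE.search(event) is not None


sample_events = [
    # Header lines copied from the question.
    "Log file wrap at 27/06/2017 2:35:27 PM",
    "      16     Question Name",
    # Hypothetical data line (assumed format, starts with the date).
    "27/06/2017 2:35:28 PM 0E5C PACKET UDP Rcv 10.0.0.1",
]

for event in sample_events:
    print(would_be_dropped(event), repr(event))
```

&lt;P&gt;Under these assumptions, anything that does not start with a digit is discarded, so the approach relies on every real event starting with the date.&lt;/P&gt;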

&lt;P&gt;However, I'd like to understand why I cannot drop the headers, so I've asked Splunk support for some advice.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Jun 2017 04:23:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364408#M66365</guid>
      <dc:creator>gjanders</dc:creator>
      <dc:date>2017-06-29T04:23:53Z</dc:date>
    </item>
    <item>
      <title>Re: How to drop log file headers before indexing?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364409#M66366</link>
      <description>&lt;P&gt;The answer was surprisingly simple: my regex was an exact match for the entire line, which stopped the trick from working. What was required was:&lt;BR /&gt;
FIELD_HEADER_REGEX=\s+16\s+Question\s+N&lt;/P&gt;

&lt;P&gt;This way the regex matches part of the line rather than the entire line, and the entire section up to that point appears to have been dropped.&lt;/P&gt;
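
&lt;P&gt;The observed effect can be pictured with a rough Python sketch. This is not how Splunk implements FIELD_HEADER_REGEX; it only illustrates the behaviour described above, where everything up to and including the first line matching the regex is discarded. The function name drop_header and the data line are hypothetical.&lt;/P&gt;

```python
import re


def drop_header(lines, header_regex):
    """Return the lines that follow the first line matching header_regex.

    If no line matches, all lines are returned unchanged.
    """
    pattern = re.compile(header_regex)
    for index, line in enumerate(lines):
        if pattern.search(line):
            return lines[index + 1:]
    return list(lines)


log_lines = [
    "Log file wrap at 27/06/2017 2:35:27 PM",
    "      15     Question Type",
    "      16     Question Name",
    # Hypothetical data line (assumed format).
    "27/06/2017 2:35:28 PM 0E5C PACKET UDP Rcv 10.0.0.1",
]

# Only the line after the "16 Question N..." header line remains.
print(drop_header(log_lines, r"\s+16\s+Question\s+N"))
```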

&lt;P&gt;EDIT: I've found a few limitations to this. The forwarder in use started logging time parsing and line breaking warnings after this setting was applied, so I assume it's attempting to parse the logs before forwarding them to the indexer.&lt;BR /&gt;
Furthermore, the forwarder's CPU usage increased in order to process this setting, so I've fallen back to performing the work on the indexer rather than on the universal forwarder, and I no longer use this trick.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 14:44:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-drop-log-file-headers-before-indexing/m-p/364409#M66366</guid>
      <dc:creator>gjanders</dc:creator>
      <dc:date>2020-09-29T14:44:50Z</dc:date>
    </item>
  </channel>
</rss>

