I read the blog post on dropping useless headers in Splunk 6 and tried using FIELD_HEADER_REGEX. I also tried the HEADER_FIELD_LINE_NUMBER trick, but neither worked as expected.
The blog post says:
I stole some of this from the Websphere App but added the FIELD_HEADER_REGEX. This tells Splunk to look for that last line of the header from above:
************* End Display Current Environment *************
And start indexing events after that.
You could also use HEADER_FIELD_LINE_NUMBER if your data writes a consistent number of header lines.
The default props.conf is:
[MSAD:NT6:DNS]
CHECK_FOR_HEADER = 0
REPORT_KV_for_microsoft_dns_web = KV_for_port,KV_for_Domain,KV_for_RecvdIP,KV_for_microsoftdns_action,KV_for_Record_type,KV_for_Record_Class
SHOULD_LINEMERGE = false
For the props.conf on the universal forwarder I added:
[MSAD:NT6:DNS]
#Drop the header lines from the file
FIELD_HEADER_REGEX=\s+16\s+Question\s+Name
That does something: it combines the header lines into a single event that looks like this:
Log file wrap at 27/06/2017 2:35:27 PM
Message logging key (for packets - other items use a subset of these fields):
Field # Information Values
------- ----------- ------
1 Date
2 Time
3 Thread ID
4 Context
5 Internal packet identifier
6 UDP/TCP indicator
7 Send/Receive indicator
8 Remote IP
9 Xid (hex)
10 Query/Response R = Response
blank = Query
11 Opcode Q = Standard Query
N = Notify
U = Update
? = Unknown
12 [ Flags (hex)
13 Flags (char codes) A = Authoritative Answer
T = Truncated Response
D = Recursion Desired
R = Recursion Available
14 ResponseCode ]
15 Question Type
16 Question Name
That is an improvement over having 16 separate events for the header, but the header has still not been dropped.
On the indexing tier I tried:
[MSAD:NT6:DNS]
TRANSFORMS-t1 = eliminate-DNSHeaders
And:
[eliminate-DNSHeaders]
REGEX=(?m)^Log file wrap at
DEST_KEY = queue
FORMAT = nullQueue
I also tried without the (?m). Clearly I'm missing something, but I am not sure how I should drop the header records, which are not useful.
Can someone let me know how to drop these records? Note that the last setting (the nullQueue transform) was placed on the indexers, not the universal forwarders.
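One thing that can be checked outside Splunk is how much of the header the pattern covers when each header line is its own event. The following is a minimal Python sketch, assuming that the transform's REGEX is tested against each event separately and that Python's re module agrees with Splunk's regex engine for a pattern this simple. It does not explain why the merged 16-line event was not dropped.

```python
import re

# A few header lines copied from the sample above. With SHOULD_LINEMERGE = false,
# each line is assumed to reach the transform as a separate event.
header_events = [
    "Log file wrap at 27/06/2017 2:35:27 PM",
    "Message logging key (for packets - other items use a subset of these fields):",
    "Field # Information Values",
    "------- ----------- ------",
    "1 Date",
    "16 Question Name",
]

# Same pattern as in the [eliminate-DNSHeaders] stanza.
pattern = re.compile(r"(?m)^Log file wrap at")

# Events the pattern matches, i.e. the ones that would be routed to nullQueue.
matched = [event for event in header_events if pattern.search(event)]

print(len(matched), "of", len(header_events), "header lines matched")
for event in matched:
    print("matched:", event)
```

Under those assumptions only the first header line matches, so even a working transform with this pattern would leave the remaining header lines in place when they arrive as separate events.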
The answer was surprisingly simple: my regex matched the entire line, and that stopped the trick from working. What was required was:
FIELD_HEADER_REGEX=\s+16\s+Question\s+N
This way the regex matches the start of the line but not the entire line, and the entire section up to that point appears to have been dropped.
EDIT: I've found a few limitations to this. The forwarder started logging time parsing and line breaking warnings after this setting was applied, so I assume it is attempting to parse the logs before forwarding them to the indexer.
Furthermore, the forwarder's CPU usage increased in order to process this setting, so I've stopped using this trick and fallen back to performing the work on the indexer rather than on the universal forwarder.
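A possible explanation, as far as I can tell from the props.conf spec, is that FIELD_HEADER_REGEX is described as a pattern for a header prefix, with the actual header starting after the pattern. The Python sketch below only illustrates the difference between the two regexes; the leading whitespace on the sample line is an assumption based on the \s+ at the start of the regex, and the helper name is made up for illustration.

```python
import re

# Last header line. The leading whitespace is assumed (the regex starts with \s+);
# the sample in this thread shows the line without it.
last_header_line = "        16  Question Name"

exact = re.compile(r"\s+16\s+Question\s+Name")   # original attempt
prefix = re.compile(r"\s+16\s+Question\s+N")     # version that worked

def remainder_after_match(pattern, line):
    """Return the text left on the line after the match, or None if no match."""
    m = pattern.match(line)
    return None if m is None else line[m.end():]

print(repr(remainder_after_match(exact, last_header_line)))   # ''
print(repr(remainder_after_match(prefix, last_header_line)))  # 'ame'
```

The exact regex consumes the whole line and leaves nothing after the match, while the shorter one leaves "ame" behind. That is consistent with the behaviour seen here, but it is not confirmed by Splunk.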
I couldn't get it to drop the headers as expected, so I've resorted to this for now:
props.conf
[MSAD:NT6:DNS]
TRANSFORMS-t1 = eliminate-dnsheaders
transforms.conf
[eliminate-dnsheaders]
REGEX = ^[^\d]
DEST_KEY = queue
FORMAT = nullQueue
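The ^[^\d] pattern relies on every packet line starting with a digit (field 1 is the date) and every header line starting with something else. A rough way to check that outside Splunk is sketched below in Python. The packet line is a placeholder that only assumes the line starts with the date, the indentation on the numbered header line is an assumption, and is_dropped is a hypothetical helper.

```python
import re

# Same pattern as in the [eliminate-dnsheaders] stanza.
drop_pattern = re.compile(r"^[^\d]")

header_lines = [
    "Log file wrap at 27/06/2017 2:35:27 PM",
    "Message logging key (for packets - other items use a subset of these fields):",
    "------- ----------- ------",
    "        16  Question Name",  # leading whitespace is assumed
]

# Placeholder packet line: only the leading date is taken from the thread.
event_line = "27/06/2017 2:35:27 PM <remaining packet fields>"

def is_dropped(line):
    """True if the line starts with a non-digit character."""
    return drop_pattern.search(line) is not None

for line in header_lines + [event_line, "16 Question Name", ""]:
    print(is_dropped(line), repr(line))
```

Note that a numbered header line without leading whitespace, or an empty line, would not match this pattern, so it is worth checking against the raw file before relying on it.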
However, I'd like to understand why I cannot drop the headers, so I've asked Splunk support for some advice.
You want to filter out everything up to
16 Question Name
right? (That contradicts REGEX=(?m)^Log file wrap at, doesn't it?)
Maybe, did you try:
FIELD_HEADER_REGEX=16\s+Question\s+Name
or simply
FIELD_HEADER_REGEX=Question\s+Name
You want to filter out everything up to 16 Question Name, right? (That contradicts REGEX=(?m)^Log file wrap at, doesn't it?)
Yes I want to filter out the headers, the last header is "16 Question Name".
maybe, did you try -
FIELD_HEADER_REGEX=16\s+Question\s+Name
or simply
FIELD_HEADER_REGEX=Question\s+Name
Before I added FIELD_HEADER_REGEX, each line such as:
Log file wrap at 27/06/2017 2:35:27 PM
Message logging key (for packets - other items use a subset of these fields):
came in as an individual event. Now I have a single 16-line event, so I'm fairly confident FIELD_HEADER_REGEX is doing something. Note that FIELD_HEADER_REGEX is set on the UF.
Also, FYI, I tested the transforms.conf with:
REGEX = Log file wrap at
That is still not working. I will now try adding FIELD_HEADER_REGEX at the indexer level in case it was supposed to be there (the documentation implies it belongs wherever the input is defined, but it's worth a try).