How to remove headers from a custom app log file?

a212830
Champion

Hi,

I'd like to remove some headers from a custom app logfile. I've tried some configs, but can't get it to work.

Here's my props.conf stanza (which doesn't filter out the header lines).

ANNOTATE_PUNCT = false
KV_MODE = auto
LINE_BREAKER = ([\r\n]+)\d{2}:\d{2}:\d{2}.\d{3}
MAX_TIMESTAMP_LOOKAHEAD = 150
NO_BINARY_CHECK = 1
FIELD_HEADER_REGEX=^Genesys*+File
SHOULD_LINEMERGE = false
TRUNCATE = 999999

Here's some sample data - I'd like to remove everything from Genesys to (and including) the File line.

Genesys Orchestration, Version:'8.1.300.46'
Copyright (c) 2008-2013 Genesys Telecommunications Laboratories, Inc.
Component versions:
Commonlib:        8.1.300.29 C2
Loglib:           8.1.300.09 MT
Gmessagelib:      8.1.300.00
GServicelib:      8.1.300.06 MT
Confservlib:      8.1.300.06
Lcalib:           8.1.300.07
T-library         8.1.200.05 HA
SCXMLlib          8.1.300.52
Build platform:   i686-linux-rhe5,64bit
Application name: ORS_RTP_Node1_BK
Application type: OrchestrationServer (161)
Command line:     ./orchestration -app ORS_RTP_Node1_BK -host myhost -port 2120 
Host name:        myhost
DST:              TZ = 1, timeb = 0
Time zone:        18000, EST, EDT
UTC time:         2015-01-07T18:31:25.493
Local time:       2015-01-07T13:31:25.493
Start time (UTC): 2014-09-13T05:50:00
Running time:     116:12:41:25
Host info:        Linux, abcdef, 2.6.18-371.8.1.0.1.el5, #1 SMP Thu Apr 24 13:43:12 PDT 2014, x86_64
File:             (981) /host123/logs/ORS_RTP_Node1_B/ORS_RTP_Node1_BK.20150107_133125_493.log

13:31:25.493 [ORSCallMonitor] OnCallInfoChanged
13:31:25.493 [IDX]: >> GET >> FMID=01MSGO2AM0A9F7MAJJ45U2LAES0018GG NOT FOUND
13:31:25.493 <<<=== 'EventCallDataChanged'(161) seq=98d28e
13:31:25.493 Int 04543 Interaction message "EventCallDataChanged" received from 66723
1 Solution

chanfoli
Builder

Based on this blog post (http://blogs.splunk.com/2013/10/22/dropping-useless-headers-in-splunk-6/), I think that to get Splunk to skip the file header, FIELD_HEADER_REGEX should match only the last line of the header, so the following might work for you:

FIELD_HEADER_REGEX=^File:

a212830
Champion

Awesome. Thanks! Nice and simple.

chanfoli
Builder

Failing the above, you could also try:

HEADER_FIELD_LINE_NUMBER = 25

Either option assumes that your application does not append a new header to the same log file when it restarts, and that the number of header lines stays consistent. If that is not the case, you will need a regex that matches all of the header lines (I am thinking of matching lines that do not start with a time string) and a transform that sends the matching lines to the null queue.
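
As a rough sketch of that null queue approach (untested against this log): the sourcetype name genesys_ors and the transform name drop_genesys_header are hypothetical placeholders, and the regex assumes every real event starts with a timestamp like 13:31:25.493, as in the sample data. With the LINE_BREAKER from the question, the whole header block should arrive as one event that does not start with a timestamp, so a single match would discard it.

```
# props.conf
# "genesys_ors" is a hypothetical sourcetype name; use your own.
[genesys_ors]
TRANSFORMS-dropheader = drop_genesys_header

# transforms.conf
# "drop_genesys_header" is a hypothetical transform name.
[drop_genesys_header]
# Match any event that does NOT begin with HH:MM:SS.mmm
REGEX = ^(?!\d{2}:\d{2}:\d{2}\.\d{3}).
DEST_KEY = queue
FORMAT = nullQueue
```

Because this routing happens at parse time, both files need to live on the indexer or heavy forwarder that parses the data, and it only affects data indexed after the change.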
