
How do I get field names from a log header and then remove that header?

Communicator

I have a GLF log file. I need to pull some info out of the header to format the log data, but other than that I don't need the header. An example:

FILE_TYPE:DAAA96DE-B0FB-4c6e-AF7B-A445F5BF9BE2  
ENCODING:UTF-8  
RECORD_SEPARATOR:30  
COLUMN_SEPARATOR:124  
ESC_CHARACTER:27  
COLUMNS:Location|Guid|Time|Tzone|Trace|Log|Importance|Severity|Exception|DeviceName|ProcessID|ThreadID|ThreadName|ScopeTag|MajorTick|MinorTick|MajorDepth|MinorDepth|RootName|RootID|CallerName|CallerID|CalleeName|CalleeID|ActionID|DSRRootContextID|DSRTransaction|DSRConnection|DSRCounter|User|ArchitectComponent|DeveloperComponent|Administrator|Unit|CSNComponent|Text  
SEVERITY_MAP: |None| |Success|W|Warning|E|Error|A|Assertion  
HEADER_END  
registry.cpp:338:-: TraceLog message 1  
|68da48f2-6079-4c34-ab80-22a8a3eff698|2019 06 14 06:49:37:792|-0500|Error| |>>|E| |w3wp|10936|5388|| ||||||||||||||||||||||assert failure: (registry.cpp:338). (FALSE : RegOpenKeyEx(Software\SAP BusinessObjects\Enterprise\CMSClusterMembers) failed. Error code: 5).  
registry.cpp:338:-: TraceLog message 2  
|83a6bb38-f5e9-5ac4-7924-8f78e98907b9|2019 06 14 06:49:37:976|-0500|Error| |>>|E| |w3wp|10936|5388|| |0|0|0|0|-|-|-|-|-|-||||||||||||assert failure: (registry.cpp:338). (FALSE : RegOpenKeyEx(Software\SAP BusinessObjects\Enterprise\CMSClusterMembers) failed. Error code: 5).  
registry.cpp:761:-: TraceLog message 3  
|2716def7-9ddf-c8a4-b8bd-af11251e2a38|2019 06 14 06:49:37:977|-0500|Error| |>>|E| |w3wp|10936|5388|| |0|0|0|0|-|-|-|-|-|-||||||||||||assert failure: (registry.cpp:761). (FALSE : RegOpenKeyEx(Software\SAP BusinessObjects\Enterprise\CMSClusterMembers) failed. Error code: 5).  

The first 5 lines, as well as lines 7 and 8, are junk. I need all of line 6, except the word "COLUMNS:", to format the rest of the data.

I know I can tell Splunk to ignore the first 8 lines, but is there a way to do that and still pull the info from line 6?


SplunkTrust

Consider writing a scripted input that reads the file, strips out the unwanted bits, performs necessary transforms, and writes the results to stdout for indexing.
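As a rough illustration of that idea, here is a minimal Python sketch. It assumes the body of the file really uses the separators declared in the header (character code 30 between records, 124 between columns) and it ignores ESC_CHARACTER handling entirely. The names `parse_glf` and `main` are made up for this example, and emitting one JSON object per record is just one possible output format, not something Splunk requires.

```python
import json
import sys


def parse_glf(text):
    """Split GLF-style text into the column names and a list of record dicts."""
    # Everything before HEADER_END is "KEY:value" metadata; the rest is data.
    header_text, _, body = text.partition("HEADER_END")
    settings = {}
    for line in header_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            settings[key] = value.strip()

    # The header gives the separators as decimal character codes (124 is "|").
    col_sep = chr(int(settings["COLUMN_SEPARATOR"]))
    rec_sep = chr(int(settings["RECORD_SEPARATOR"]))
    columns = settings["COLUMNS"].split(col_sep)

    records = []
    for raw in body.split(rec_sep):
        if not raw.strip():
            continue
        # Limiting the split keeps extra separators inside the last column.
        values = [v.strip() for v in raw.split(col_sep, len(columns) - 1)]
        records.append(dict(zip(columns, values)))
    return columns, records


def main(path):
    # Assumption: the file is UTF-8, as the ENCODING header line states.
    with open(path, encoding="utf-8") as handle:
        _, records = parse_glf(handle.read())
    for record in records:
        # One JSON object per line on stdout, header lines already dropped.
        print(json.dumps(record))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Run it with the log path as the only argument. The header never reaches stdout, and every record carries the field names taken from the COLUMNS line.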

---
If this reply helps you, an upvote would be appreciated.

Super Champion

Hi @cboillot,

From what I understand, you've already managed to get rid of the lines you don't need. I'm not sure which method you used, but this option in props.conf can skip the lines at the start of the file:

PREAMBLE_REGEX = <regex>
* A regular expression that lets Splunk software ignore "preamble lines",
  or lines that occur before lines that represent structured data.
* When set, Splunk software ignores these preamble lines,
  based on the pattern you specify.
* Default: not set

To tell Splunk to read the header from a specific line, use the following option in props.conf:

HEADER_FIELD_LINE_NUMBER = <integer>
* The line number of the line within the specified file or source that
  contains the header fields.
* If set to 0, Splunk software attempts to
  locate the header fields within the file automatically.
* Default: 0

You can set it to 6 if it's the 6th line that you need.
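Before hard-coding 6, it may be worth confirming where the header line actually sits in your files. The Python sketch below is only a checking aid and not part of Splunk. The function name `find_columns_line` is hypothetical, and it assumes the field names are separated by `|` as in the sample. It also shows that the line as written begins with the literal `COLUMNS:` prefix, which the question asks to exclude.

```python
def find_columns_line(lines, prefix="COLUMNS:"):
    """Return (1-based line number, field names) for the first matching line."""
    for number, line in enumerate(lines, start=1):
        if line.startswith(prefix):
            # Drop the "COLUMNS:" prefix so only the field names remain.
            names = line[len(prefix):].strip().split("|")
            return number, names
    return None


# Shortened version of the header from the question, for illustration only.
header = [
    "FILE_TYPE:DAAA96DE-B0FB-4c6e-AF7B-A445F5BF9BE2",
    "ENCODING:UTF-8",
    "RECORD_SEPARATOR:30",
    "COLUMN_SEPARATOR:124",
    "ESC_CHARACTER:27",
    "COLUMNS:Location|Guid|Time|Text",
    "HEADER_END",
]
print(find_columns_line(header))
```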

Let me know how that works out for you.

Cheers,
David
