How to index a file with multiline events and intermittently occurring timestamps?

zapping575
Path Finder

I have a particularly challenging log format and would appreciate any input on how to tackle this problem.

Problem

I am looking for a feasible props.conf setup that will correctly index the log below.

Example (blank lines only added for readability):

SINGLE_LINE_LOG_EVENT
SINGLE_LINE_LOG_EVENT
OTHER_SINGLE_LINE_LOG_EVENT

Tue 06 Jun 10:00:00 UTC 2023
ANOTHER_SINGLE_LINE_LOG_EVENT

Tue 06 Jun 10:00:01 UTC 2023
LARGE_MULTILINE_EVENT

The first three lines are each a single event and should be parsed as such, but they have no timestamp.

The fourth and fifth lines together form a single event.

Lines 6 and 7 also form a single event; the content at line 7 is a multiline block that must be kept together and indexed as part of that single event.

If there is no other solution, I am prepared to accept that the lines without a timestamp get assigned the current timestamp.

What I have already tried

I tried the following (the regex matches the timestamp):

MUST_NOT_BREAK_AFTER = .{3}\s.{3}\s\d{2}\s\d{2}:\d{2}:\d{2}\sUTC\s\d{4}
MUST_BREAK_AFTER = .{3}\s.{3}\s\d{2}\s\d{2}:\d{2}:\d{2}\sUTC\s\d{4}

I also tried this, in various combinations and with different capture groups. Note that the file in question contains only newlines and no carriage returns, hence no '\r'.

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\n].{3}\s.{3}\s\d{2}\s\d{2}:\d{2}:\d{2}\sUTC\s\d{4})

isoutamo
SplunkTrust
Is a single program writing those different log entries, or are several programs writing to the same log file?
To be honest, if this is your company's or a partner's software, I suggest asking them to change it to write separate log files, if possible.
r. Ismo

zapping575
Path Finder

Hi @isoutamo, thanks for the reply

Unfortunately, I don't know how many processes are writing to that file. I can only use it "as is".

You are right, however: this issue should be addressed on the side of the application(s) writing to that file.

Regards


VatsalJagani
SplunkTrust

@zapping575 - I think you need to write your own parser for this.

  • Basically, do not monitor the file directly in Splunk.
  • Instead, write a simple Python scripted input in Splunk. (You need to use a Heavy Forwarder instead of a UF.)
  • Then parse the file as needed. With a scripted input, your Python code can assign a timestamp to each event and ingest each event separately, taking the timestamp from the lines it has already read.

Also, I can see that you have a combination of single-line and multiline events. That can also be handled in your Python code, which will act as the parser.
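
A minimal sketch of the grouping logic such a parser might use. It is not a complete scripted input, and it rests on assumptions that are not confirmed in this thread: every timestamp line looks exactly like the example (`Tue 06 Jun 10:00:00 UTC 2023`), a timestamp line starts an event that runs until the next timestamp line, lines seen before the first timestamp are single-line events that get the current time, and blank lines can be skipped. The names `parse_events`, `TIMESTAMP_RE` and `TIMESTAMP_FORMAT` are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Matches header lines such as "Tue 06 Jun 10:00:00 UTC 2023" (assumed format).
TIMESTAMP_RE = re.compile(r"^\w{3} \d{2} \w{3} \d{2}:\d{2}:\d{2} UTC \d{4}$")
# Note: %a and %b depend on the locale; an English/C locale is assumed.
TIMESTAMP_FORMAT = "%a %d %b %H:%M:%S UTC %Y"


def parse_events(lines, now=None):
    """Group raw log lines into a list of (timestamp, text) events."""
    now = now or datetime.now(timezone.utc)
    events = []
    header_ts = None  # timestamp of the event currently being collected
    body = []         # lines of that event, including its timestamp line

    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # assumption: blank lines carry no information
        if TIMESTAMP_RE.match(line):
            # A new timestamp line closes the previous timestamped event.
            if header_ts is not None:
                events.append((header_ts, "\n".join(body)))
            header_ts = datetime.strptime(line, TIMESTAMP_FORMAT).replace(
                tzinfo=timezone.utc
            )
            body = [line]
        elif header_ts is None:
            # No timestamp seen yet: single-line event, stamped with "now".
            events.append((now, line))
        else:
            # Continuation of the current timestamped (possibly multiline) event.
            body.append(line)

    if header_ts is not None:
        events.append((header_ts, "\n".join(body)))
    return events
```

One limitation of this sketch: once a timestamp line has been seen, it cannot tell a new timestamp-less single-line event apart from a continuation line, so it treats every such line as part of the current event. If the real file mixes both after the first timestamp, the grouping rule needs an extra signal that only the log's author can provide.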

 

I hope this helps!!! Kindly upvote if it does!!!!

zapping575
Path Finder

Cheers @VatsalJagani 

Thank you for the help.

I cannot use an HF; I can only use the UF.

Since there are no other answers, I figure that manual preprocessing is the only way to go in this case.
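
For anyone landing here later, a rough sketch of what the output side of such a preprocessing step could look like, run outside Splunk before the UF picks up the result. It assumes the events have already been grouped into `(timestamp, text)` pairs by whatever logic fits the file, and it rewrites each event as one line that starts with an ISO 8601 UTC timestamp, with embedded newlines escaped as a literal `\n`. The function names `to_single_line` and `write_preprocessed` and the escaping scheme are illustrative choices, not anything prescribed by Splunk.

```python
from datetime import datetime, timezone


def to_single_line(timestamp, text):
    """Render one event as a single line that starts with an ISO 8601 timestamp."""
    # Escape backslashes first so the escaped newlines stay unambiguous.
    escaped = text.replace("\\", "\\\\").replace("\n", "\\n")
    stamp = timestamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} {escaped}"


def write_preprocessed(events, out_path):
    """Write (timestamp, text) events to out_path, one event per line."""
    with open(out_path, "w", encoding="utf-8") as out:
        for timestamp, text in events:
            out.write(to_single_line(timestamp, text) + "\n")


if __name__ == "__main__":
    # Hypothetical sample data and output file name.
    sample = [
        (datetime(2023, 6, 6, 10, 0, 1, tzinfo=timezone.utc), "LINE_1\nLINE_2"),
    ]
    write_preprocessed(sample, "preprocessed.log")
```

With every event on its own line and a timestamp at the start, the rewritten file should be indexable with ordinary single-line settings, though the exact props.conf would still need to be tested against real data.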
