Getting Data In

How to index file with multiline events and intermittently occurring timestamps?

zapping575
Communicator

I have a particularly challenging log format and would appreciate any input on how to tackle this problem.

Problem

I am looking for a feasible props.conf setup that will correctly index the log below.

Example (blank lines only added for readability):

 

SINGLE_LINE_LOG_EVENT
SINGLE_LINE_LOG_EVENT
OTHER_SINGLE_LINE_LOG_EVENT

Tue 06 Jun 10:00:00 UTC 2023
ANOTHER_SINGLE_LINE_LOG_EVENT

Tue 06 Jun 10:00:01 UTC 2023
LARGE_MULTILINE_EVENT

 

The first three lines are separate single-line events and should be parsed as such, but they have no timestamp.

The fourth and fifth lines together form a single event.

Lines 6 and 7 also form a single event, but the content at line 7 is itself a multiline block that must be kept together as one event.

I am prepared to accept that the lines without a timestamp get assigned the current timestamp, if there is no other solution.

What I have already tried

I tried the following (the regex matches the timestamp line):

 

MUST_NOT_BREAK_AFTER = .{3}\s.{3}\s\d{2}\s\d{2}:\d{2}:\d{2}\sUTC\s\d{4}
MUST_BREAK_AFTER = .{3}\s.{3}\s\d{2}\s\d{2}:\d{2}:\d{2}\sUTC\s\d{4}

 

I also tried the following, in various combinations with different capture groups. Note that the file in question contains only newlines and no carriage returns, hence no '\r'.

 

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\n].{3}\s.{3}\s\d{2}\s\d{2}:\d{2}:\d{2}\sUTC\s\d{4})

 

 


isoutamo
SplunkTrust
Is a single program writing all of these different log entries, or are several programs writing to the same log file?
To be honest, if this is your company's or a partner's software, I suggest asking them to write separate log files if possible.
r. Ismo

zapping575
Communicator

Hi @isoutamo, thanks for the reply

Unfortunately, I don't know how many processes are writing to that file. I can only use it "as is".

You are right, however: this issue should be addressed on the side of the application(s) writing to that file.

Regards


VatsalJagani
SplunkTrust

@zapping575 - I think you need to write your own parser that can do that.

  • Basically, do not monitor the file directly in Splunk.
  • Instead, write a simple Python scripted input in Splunk. (You need to use a Heavy Forwarder instead of a UF.)
  • Then parse the file as needed. With a scripted input, your Python code can ingest each event separately and assign it the timestamp extracted from the preceding timestamp line.

 

Also, I can see you have a combination of single-line and multiline events. That can also be handled in your Python code, which will act as the parser.
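The parsing step described above could look roughly like the sketch below. It is only an illustration, not a tested Splunk scripted input: the function name parse_events is made up, the timestamp layout is taken from the sample in the question ("Tue 06 Jun 10:00:00 UTC 2023"), and it assumes that every line following a timestamp line belongs to that event until the next timestamp line appears. Lines seen before the first timestamp are returned with no timestamp, so the caller can fall back to the current time. Blank lines are dropped, which is also an assumption.

```python
import re
from datetime import datetime, timezone

# A whole timestamp line as shown in the sample (assumed layout),
# e.g. "Tue 06 Jun 10:00:00 UTC 2023".
TS_LINE = re.compile(r"^[A-Za-z]{3} \d{2} [A-Za-z]{3} \d{2}:\d{2}:\d{2} UTC \d{4}$")
TS_FORMAT = "%a %d %b %H:%M:%S UTC %Y"  # %a/%b assume English day and month names


def parse_events(lines):
    """Group raw lines into (timestamp_or_None, text) tuples.

    Assumption: after a timestamp line, all following lines belong to
    that event until the next timestamp line or the end of the input.
    """
    events = []
    pending_ts = None   # timestamp of the event currently being collected
    pending_body = []   # lines collected for that event so far

    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # assumption: blank lines carry no information
        if TS_LINE.match(line):
            # A new timestamp closes the previous timestamped event.
            if pending_ts is not None and pending_body:
                events.append((pending_ts, "\n".join(pending_body)))
            parsed = datetime.strptime(line, TS_FORMAT)
            pending_ts = parsed.replace(tzinfo=timezone.utc)
            pending_body = []
        elif pending_ts is not None:
            pending_body.append(line)
        else:
            # No timestamp seen yet: treat the line as its own event.
            events.append((None, line))

    # Emit the last collected event, if any.
    if pending_ts is not None and pending_body:
        events.append((pending_ts, "\n".join(pending_body)))
    return events
```

One limitation of this rule: an untimestamped single-line event that appears after a timestamped event would be merged into that event, because nothing in the sample marks where a multiline event ends.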

 

I hope this helps!!! Kindly upvote if it does!!!!

zapping575
Communicator

Cheers @VatsalJagani 

Thank you for the help.

I cannot use an HF; I can only use the UF.

Since there are no other answers, I figure that manual preprocessing is the only way to go in this case.
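For anyone landing here later, one way to do such preprocessing is to rewrite the events into a second file with exactly one event per line, each carrying an explicit timestamp, and monitor that file instead. The sketch below is a hypothetical helper, not something confirmed in this thread: the name to_json_lines, the field names "time" and "message", and the choice of JSON are all assumptions. It takes already-grouped (timestamp_or_None, text) pairs and uses a caller-supplied fallback time for events without a timestamp.

```python
import json
from datetime import datetime, timezone


def to_json_lines(events, fallback):
    """Yield one JSON string per event, each on a single line.

    events:   iterable of (timestamp_or_None, text) pairs
    fallback: datetime used for events that have no timestamp
    """
    for ts, text in events:
        stamp = ts if ts is not None else fallback
        # json.dumps escapes embedded newlines, so a multiline event
        # still occupies exactly one line in the output file.
        yield json.dumps({"time": stamp.isoformat(), "message": text})


if __name__ == "__main__":
    demo = [
        (None, "SINGLE_LINE_LOG_EVENT"),
        (datetime(2023, 6, 6, 10, 0, 1, tzinfo=timezone.utc), "first line\nsecond line"),
    ]
    for line in to_json_lines(demo, fallback=datetime.now(timezone.utc)):
        print(line)
```

With one event per line and a timestamp on every line, event breaking on the Splunk side no longer depends on the intermittent timestamp lines in the original file.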
