Getting Data In

In an event with two timestamps, automatically choose the good one

zapping575
Path Finder

I have an event like this:

 

~01~20241009-100922;899~19700101-000029;578~ASDF~QWER~YXCV

 

There are two timestamps in this event. I have set up my stanza to extract the second one, but in this particular case the second one is what I consider "bad".

For the record, here is my props.conf:

 

[QWERTY]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
MAX_TIMESTAMP_LOOKAHEAD = 43
TIME_FORMAT = %Y%m%d-%H%M%S;%3N
TIME_PREFIX = ^\#\d{2}\#.{0,19}\#
MAX_DAYS_AGO = 10951
REPORT-1 = some-report-1
REPORT-2 = some-report-2

 

The consequence seems to be that Splunk indexes the entire file as a single event, which is something I absolutely want to avoid.

Also, I do need to use line merging, as the same file may contain XML dumps.

So what I need is something that implements the following logic:

 

if second_timestamp_is_bad:
  extract_first_timestamp()
else:
  extract_second_timestamp()

 

Any tips or hints on how to handle this scenario using only options and functionality provided by Splunk are greatly appreciated.


PickleRick
SplunkTrust

Adding to what's already been said: there is very rarely a legitimate use case for SHOULD_LINEMERGE. Relying on Splunk recognizing something as a date in order to break the data stream into events is not a very good idea. You should set a proper LINE_BREAKER instead.

isoutamo
SplunkTrust
I suppose SHOULD_LINEMERGE=true is the default for some historical, now-forgotten reason, and nobody has been brave enough to change the default to false 😉

richgalloway
SplunkTrust

Timestamps are extracted before INGEST_EVAL is performed, so you'll need to use the := operator to replace _time.

These props should work better than those shown.

[QWERTY]
# better performance with LINE_BREAKER
SHOULD_LINEMERGE = false
# We're not breaking events before a date
BREAK_ONLY_BEFORE_DATE = false
# Break events after newline and before "~01~"
LINE_BREAKER = ([\r\n]+)~\d\d~
MAX_TIMESTAMP_LOOKAHEAD = 43
TIME_FORMAT = %Y%m%d-%H%M%S;%3N
# Skip to the second timestamp (after the milliseconds of the first TS)
TIME_PREFIX = ;\d{3}~
MAX_DAYS_AGO = 10951
REPORT-1 = some-report-1
REPORT-2 = some-report-2
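
To go with the props above, here is a rough, untested sketch of what the INGEST_EVAL part might look like. The stanza name qwerty_pick_timestamp and the TRANSFORMS class name are made up for illustration, and the "bad" test (second timestamp has a year before 2000) is only an assumption based on the 1970 value in the sample event. The character offsets assume every event starts with the fixed layout ~NN~, a 19-character timestamp, ~, and another 19-character timestamp.

```ini
# transforms.conf
# Hypothetical stanza name; adjust the "bad timestamp" test to your data.
[qwerty_pick_timestamp]
# Assumed layout: chars 1-4 = ~NN~, chars 5-23 = first timestamp,
# char 24 = ~, chars 25-43 = second timestamp.
# If the year of the second timestamp is before 2000 (assumption), parse the
# first timestamp instead; otherwise keep the _time that was already extracted.
# := overwrites the existing _time value.
INGEST_EVAL = _time := if(tonumber(substr(_raw, 25, 4)) < 2000, strptime(substr(_raw, 5, 19), "%Y%m%d-%H%M%S;%3N"), _time)

# props.conf (add to the existing [QWERTY] stanza)
[QWERTY]
TRANSFORMS-pick_timestamp = qwerty_pick_timestamp
```

Using substr with fixed offsets avoids regex escaping inside transforms.conf, but it only works while the prefix and both timestamps keep a constant width. If that does not hold for your data, a regex-based extraction would be needed instead.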
---
If this reply helps you, Karma would be appreciated.

isoutamo
SplunkTrust
You could probably do this with INGEST_EVAL. First test whether you can do it in SPL as a single "eval ......" statement, chaining several expressions with suitable if() conditions and so on. Once you get it working in SPL, copy it into transforms.conf as one INGEST_EVAL expression.

zapping575
Path Finder

I see that INGEST_EVAL allows for the use of conditionals. Thank you very much, I'll give that a try.
