I have an event like this:
~01~20241009-100922;899~19700101-000029;578~ASDF~QWER~YXCV
There are two timestamps in this event. I have set up my stanza to extract the second one, but in this particular case the second one is what I consider "bad".
For the record, here is my props.conf:
[QWERTY]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
MAX_TIMESTAMP_LOOKAHEAD = 43
TIME_FORMAT = %Y%m%d-%H%M%S;%3N
TIME_PREFIX = ^~\d{2}~.{0,19}~
MAX_DAYS_AGO = 10951
REPORT-1 = some-report-1
REPORT-2 = some-report-2
The consequence of this seems to be that Splunk indexes the entire file as a single event, which is something I absolutely want to avoid.
Also, I do need to use line merging, as the same file may contain XML dumps.
So what I need is something that implements the following logic:
if second_timestamp_is_bad:
    extract_first_timestamp()
else:
    extract_second_timestamp()
Any tips or hints on how to mitigate this scenario using only options and functionality provided by Splunk are greatly appreciated.
Adding to what's already been said: there is very rarely a legitimate use case for SHOULD_LINEMERGE. Relying on Splunk recognizing something as a date to break the data stream into events is not a good idea. Instead, set a proper LINE_BREAKER.
Timestamps are extracted before INGEST_EVAL is performed, so you'll need to use the := operator to replace _time.
These props should work better than those shown.
[QWERTY]
# better performance with LINE_BREAKER
SHOULD_LINEMERGE = false
# We're not breaking events before a date
BREAK_ONLY_BEFORE_DATE = false
# Break events after newline and before "~01~"
LINE_BREAKER = ([\r\n]+)~\d\d~
MAX_TIMESTAMP_LOOKAHEAD = 43
TIME_FORMAT = %Y%m%d-%H%M%S;%3N
# Skip to the second timestamp (after the milliseconds of the first TS)
TIME_PREFIX = ;\d{3}~
MAX_DAYS_AGO = 10951
REPORT-1 = some-report-1 REPORT-2 = some-report-2
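To illustrate the := replacement mentioned above, something along these lines could serve as a starting point. This is only a rough, untested sketch: the stanza name fix_bad_second_timestamp, the 1980 cutoff date, and the substr() offsets are assumptions derived from the sample event, not settings taken from this thread.
# transforms.conf (sketch; stanza name, cutoff and offsets are assumptions)
[fix_bad_second_timestamp]
# If the timestamp Splunk extracted is suspiciously old (e.g. the 1970 epoch
# from the bad second timestamp), re-parse the first timestamp from _raw instead.
# substr(_raw, 5, 15) picks "20241009-100922" out of "~01~20241009-100922;899~...";
# milliseconds are dropped here for simplicity.
INGEST_EVAL = _time := if(_time < strptime("1980-01-01", "%Y-%m-%d"), strptime(substr(_raw, 5, 15), "%Y%m%d-%H%M%S"), _time)

# props.conf (in addition to the settings shown above)
[QWERTY]
TRANSFORMS-fix_bad_second_timestamp = fix_bad_second_timestamp
If millisecond precision matters, the substring length could be extended to 19 and the format string to %Y%m%d-%H%M%S;%3N.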
I see that INGEST_EVAL allows for the use of conditionals. Thank you very much, I'll give that a try.