In an event with two timestamps, automatically choose the good one

zapping575
Path Finder

I have an event like this:

~01~20241009-100922;899~19700101-000029;578~ASDF~QWER~YXCV

There are two timestamps in this event. I have set up my stanza to extract the second one, but in this particular case the second one is what I consider "bad".

For the record, here is my props.conf:

[QWERTY]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
MAX_TIMESTAMP_LOOKAHEAD = 43
TIME_FORMAT = %Y%m%d-%H%M%S;%3N
TIME_PREFIX = ^\#\d{2}\#.{0,19}\#
MAX_DAYS_AGO = 10951
REPORT-1 = some-report-1
REPORT-2 = some-report-2

The consequence seems to be that Splunk indexes the entire file as a single event, which is something I absolutely want to avoid.

Also, I do need to use line merging, as the same file may contain XML dumps.

So what I need is something that implements the following logic:

if second_timestamp_is_bad:
  extract_first_timestamp()
else:
  extract_second_timestamp()

Any tips or hints on how to handle this scenario using only functionality provided by Splunk are greatly appreciated.


PickleRick
SplunkTrust

Adding to what's already been said - there is very rarely a legitimate use case for SHOULD_LINEMERGE. Relying on Splunk recognizing something as a date in order to break the data stream into events is not a very good idea. You should set a proper LINE_BREAKER instead.

isoutamo
SplunkTrust
I suppose SHOULD_LINEMERGE=true is the default for some historical, now-forgotten reason, and nobody has been brave enough to change the default to false 😉

richgalloway
SplunkTrust

Timestamps are extracted before INGEST_EVAL is performed, so you'll need to use the := operator to replace _time.

These props should work better than those shown.

[QWERTY]
# better performance with LINE_BREAKER
SHOULD_LINEMERGE = false
# We're not breaking events before a date
BREAK_ONLY_BEFORE_DATE = false
# Break events after newline and before "~01~"
LINE_BREAKER = ([\r\n]+)~\d\d~
MAX_TIMESTAMP_LOOKAHEAD = 43
TIME_FORMAT = %Y%m%d-%H%M%S;%3N
# Skip to the second timestamp (after the milliseconds of the first TS)
TIME_PREFIX = ;\d{3}~
MAX_DAYS_AGO = 10951
REPORT-1 = some-report-1
REPORT-2 = some-report-2

isoutamo
SplunkTrust
You could probably do this with INGEST_EVAL. First test the logic in a search with "eval ......", written as a single expression that nests the if() conditions you need. Once it works in SPL, copy it into transforms.conf as one INGEST_EVAL expression.
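
As a rough illustration of the INGEST_EVAL idea, the conditional could be sketched like this. It is an untested sketch, not a confirmed solution: the transform name qwerty_pick_timestamp and the class name pick_time are made up, and "bad" is assumed to mean "the second timestamp is earlier than 2000-01-01 UTC" (epoch 946684800). Both timestamps are parsed from _raw rather than compared against the already-extracted _time, because with MAX_DAYS_AGO in place a 1970 timestamp may be rejected at extraction time. The regexes assume the ~NN~ts1~ts2~ layout from the sample event.

```ini
# props.conf (sketch): attach the transform to the existing sourcetype.
# "pick_time" and "qwerty_pick_timestamp" are hypothetical names.
[QWERTY]
TRANSFORMS-pick_time = qwerty_pick_timestamp

# transforms.conf (sketch)
[qwerty_pick_timestamp]
# Assumption: a second timestamp before 2000-01-01 UTC (946684800) counts as "bad".
# if(): use the second timestamp when it parses and is recent enough,
#       otherwise fall back to the first timestamp.
# coalesce(): keep the already-extracted _time when neither parses
#       (for example, an XML dump that does not match the pattern).
# (?s) lets ".*" consume the remaining lines of a multi-line event.
INGEST_EVAL = _time := coalesce(if(strptime(replace(_raw, "(?s)^~\d\d~[^~]+~([^~]+)~.*$", "\1"), "%Y%m%d-%H%M%S;%3N") >= 946684800, strptime(replace(_raw, "(?s)^~\d\d~[^~]+~([^~]+)~.*$", "\1"), "%Y%m%d-%H%M%S;%3N"), strptime(replace(_raw, "(?s)^~\d\d~([^~]+)~.*$", "\1"), "%Y%m%d-%H%M%S;%3N")), _time)
```

The expression inside coalesce() can be pasted into a search-time eval against a few sample events to check the regexes and the threshold before it goes into transforms.conf.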

zapping575
Path Finder

I see that INGEST_EVAL allows for the use of conditionals. Thank you very much, I'll give that a try.
