Getting Data In

What to do with events that have no timestamps

Urbanpope
Explorer

I have been ripping my hair out for the last few nights trying to figure out a solution for this issue. I have a log being ingested by a UF that has some annoying characteristics. Looks a bit like this:

** Process Started **
2021-06-01 14:40:21 INFO Application is loading something
2017-06-01 14:40:22 INFO And another thing
2017-06-01 14:40:23 WARN Something might have broken
** Process Finished **

** Process Started **
2021-06-02 20:15:50 INFO Application has done something interesting
** Process Finished **

Between the two messages are nice, timestamped, single line events. Those ones load up pretty well using defaults but the pesky non-timestamped application messages are causing all sorts of issues. I can't filter them out and it's preferable that events don't start with "** Process Started **".

The best hack I have been able to come up with so far is:
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3Q
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = \d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}
HEADER_FIELD_LINE_NUMBER = 2
PREAMBLE_REGEX = Process\s+Started

The first non-timestamped line is ignored and the rest are bundled into the end of an event. But there must be a better way.


venkatasri
SplunkTrust

Hi @Urbanpope 

You can try the following; it replaces the header and footer lines with nothing. Deploy this props.conf configuration to the HF/indexer.

[your_sourcetype]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}
SEDCMD-removeheadersfooters=s/\*\*\s+(Process Finished|Process Started)\s+\*\*//g
TIME_FORMAT=%Y-%m-%d %H:%M:%S
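
To see roughly what these two settings do to the sample log before deploying anything, here is a minimal Python sketch. It only approximates Splunk's behaviour and is not how Splunk is implemented: LINE_BREAKER is modelled as a split on the newlines that sit just before a timestamp, and SEDCMD as a regex substitution. The names `SAMPLE` and `simulate` are made up for this illustration, and dropping empty events and trimming whitespace are my own simplifications.

```python
import re

# Sample log copied from the question.
SAMPLE = """** Process Started **
2021-06-01 14:40:21 INFO Application is loading something
2017-06-01 14:40:22 INFO And another thing
2017-06-01 14:40:23 WARN Something might have broken
** Process Finished **

** Process Started **
2021-06-02 20:15:50 INFO Application has done something interesting
** Process Finished **
"""

TIMESTAMP = r"\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}"

# Approximates LINE_BREAKER: break on newlines that are followed by a timestamp.
# The newlines themselves are discarded, like the first capture group in Splunk.
BREAK_RE = re.compile(r"[\r\n]+(?=" + TIMESTAMP + ")")

# Same pattern as the SEDCMD above, applied to each event after breaking.
MARKER_RE = re.compile(r"\*\*\s+(Process Finished|Process Started)\s+\*\*")


def simulate(raw):
    """Return the events this sketch would produce for the raw text."""
    events = []
    for chunk in BREAK_RE.split(raw):
        # Remove the marker lines, then trim leftover whitespace
        # (trimming is a simplification for readability of the output).
        cleaned = MARKER_RE.sub("", chunk).strip()
        if cleaned:  # assumption: skip chunks that end up empty
            events.append(cleaned)
    return events


for event in simulate(SAMPLE):
    print(repr(event))
```

On the sample from the question this yields four events, each starting with a timestamp. The marker lines are attached to the neighbouring chunk by the line breaking and then blanked out by the substitution.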

----

An upvote would be appreciated and accept solution if it helps!


Urbanpope
Explorer

Thanks for that venkatasri, much appreciated.
With a few tweaks I was able to get it to work in our dev environment.


venkatasri
SplunkTrust

That's great, glad it helped.


Urbanpope
Explorer

Thanks for your reply, venkatasri.

Using SEDCMD is a good suggestion; however, we are deploying to a UF, so we are pretty limited in what we can do.

If I remember correctly, cooked data skips most of the pipelines on the indexer at index time, but would that also apply to SEDCMD?


venkatasri
SplunkTrust

@Urbanpope 

UF functionality is limited to input and forwarding; the actual cooking happens on the HF/indexer. Having said that, if you send HF -> indexer, the indexer just does the indexing; the rest of the pipelines are skipped because the data was already processed on the HF.

SEDCMD works only at index time, which means on the HF/indexer. If you have that limitation and no control over those tiers, pre-process the file by removing the lines that do not have a timestamp, and configure the UF to monitor the pre-processed file.
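
The pre-processing step could look something like the Python sketch below. It is only an illustration under a few assumptions: every wanted line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, the whole file is rewritten on each run, and the function names `keep_timestamped` and `preprocess` are hypothetical. Handling a growing or rotating log is left out.

```python
import re

# A line is kept only if it begins with a timestamp like 2021-06-01 14:40:21.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\b")


def keep_timestamped(lines):
    """Return only the lines that start with a timestamp."""
    return [line for line in lines if TIMESTAMP_RE.match(line)]


def preprocess(src_path, dst_path):
    """Copy timestamped lines from src_path to dst_path; return how many were kept."""
    with open(src_path, encoding="utf-8", errors="replace") as src:
        kept = keep_timestamped(src)
    # Overwrites dst_path on every run (a simplification for this sketch).
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(kept)
    return len(kept)
```

You would then call something like `preprocess("/path/to/app.log", "/path/to/app.clean.log")` from a scheduled job (both paths are placeholders) and point the UF monitor stanza at the cleaned file instead of the original.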

