What to do with events that have no timestamps

Urbanpope
Explorer

I have been ripping my hair out for the last few nights trying to figure out a solution for this issue. I have a log being ingested by a UF that has some annoying characteristics. Looks a bit like this:

** Process Started **
2021-06-01 14:40:21 INFO Application is loading something
2017-06-01 14:40:22 INFO And another thing
2017-06-01 14:40:23 WARN Something might have broken
** Process Finished **

** Process Started **
2021-06-02 20:15:50 INFO Application has done something interesting
** Process Finished **

Between the two marker messages are nice, timestamped, single-line events. Those load up pretty well using the defaults, but the pesky non-timestamped application messages are causing all sorts of issues. I can't filter them out, and it's preferable that events don't start with "** Process Started **".

The best hack I have been able to come up with so far is:
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3Q
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = \d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}
HEADER_FIELD_LINE_NUMBER = 2
PREAMBLE_REGEX = Process\s+Started

The first non-timestamped line is ignored and the rest are bundled into the end of an event. But there must be a better way.

venkatasri
SplunkTrust

Hi @Urbanpope 

You can try the following; it removes the header and footer lines. Deploy this props configuration to the HF/indexer.

[your_sourcetype]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}
SEDCMD-removeheadersfooters=s/\*\*\s+(Process Finished|Process Started)\s+\*\*//g
TIME_FORMAT=%Y-%m-%d %H:%M:%S
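
To see what this configuration is meant to do to the sample log, here is a rough Python sketch. It is only an emulation under assumptions, not Splunk itself: `break_events` and `apply_sedcmd` are hypothetical helper names, the sketch assumes that the text matched by the first capture group of LINE_BREAKER is discarded and the rest of the match starts the next event, and it strips whitespace and drops empty results purely for readability.

```python
import re

# Same patterns as in the props.conf stanza above.
LINE_BREAKER = re.compile(r"([\r\n]+)\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}")
SEDCMD = re.compile(r"\*\*\s+(Process Finished|Process Started)\s+\*\*")


def break_events(raw):
    """Split raw text where LINE_BREAKER matches (hypothetical helper).

    Assumption: capture group 1 is thrown away and the timestamp that
    follows it becomes the start of the next event.
    """
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:])
    return events


def apply_sedcmd(event):
    """Remove the header/footer markers, like the s/.../ /g expression."""
    # strip() is only for display here; it is not part of the SEDCMD.
    return SEDCMD.sub("", event).strip()


def process(raw):
    cleaned = [apply_sedcmd(event) for event in break_events(raw)]
    # Dropping empty results is a simplification of this sketch.
    return [event for event in cleaned if event]


SAMPLE = "\n".join([
    "** Process Started **",
    "2021-06-01 14:40:21 INFO Application is loading something",
    "2017-06-01 14:40:22 INFO And another thing",
    "2017-06-01 14:40:23 WARN Something might have broken",
    "** Process Finished **",
    "",
    "** Process Started **",
    "2021-06-02 20:15:50 INFO Application has done something interesting",
    "** Process Finished **",
]) + "\n"

if __name__ == "__main__":
    for event in process(SAMPLE):
        print(repr(event))
```

In this emulation the marker lines end up attached to the end of the preceding timestamped event after line breaking, and the substitution then removes them, which is why no event starts with "** Process Started **".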

----

An upvote would be appreciated and accept solution if it helps!


Urbanpope
Explorer

Thanks for that venkatasri, much appreciated.
With a few tweaks I was able to get it to work in our dev environment.

venkatasri
SplunkTrust

That's great. Glad it helped.

Urbanpope
Explorer

Thanks for your reply, venkatasri.

Using SEDCMD is a good suggestion; however, we are deploying to a UF, so we are pretty limited in what we can do.

If I remember correctly, cooked data skips most of the pipelines on the indexer at index time, but would that also apply to SEDCMD?

venkatasri
SplunkTrust

@Urbanpope 

UF functionality is limited to input and forwarding; the actual cooking happens on the HF/indexer. Having said that, if you forward HF -> indexer, then the indexer just does the indexing and the rest of the pipelines are skipped, because the data was already processed on the HF.

SEDCMD works only at index time, which means on the HF/indexer. If you are limited and have no control over those tiers, pre-process the file by removing the lines that do not have a timestamp, and configure the UF to monitor the pre-processed files.
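
A minimal sketch of that pre-processing step, in Python, might look like the following. The function names `keep_timestamped` and `filter_file` and the file paths are hypothetical, and the timestamp pattern is an assumption based on the sample log in the question. It rewrites the output in one pass, so a continuously growing log would need extra handling (for example, running it on rotation) that is not shown here.

```python
import re

# Assumption: wanted lines start with "YYYY-MM-DD HH:MM:SS", as in the sample.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\b")


def keep_timestamped(lines):
    """Yield only the lines that begin with a timestamp."""
    for line in lines:
        if TIMESTAMP.match(line):
            yield line


def filter_file(src_path, dst_path):
    """Copy src_path to dst_path, dropping lines without a timestamp.

    Both paths are placeholders; the UF would monitor dst_path.
    """
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(keep_timestamped(src))
```

Calling something like `filter_file("app.log", "app.clean.log")` from a scheduled job, with the UF monitoring the cleaned file, would be one way to wire it up; both file names are made up for illustration.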

---

An upvote would be appreciated and accept solution if it helps!
