Field Extraction (Regex) When Column Is Sometimes Absent

RMartinezDTV
Path Finder

Hi, I'm working on a regex for field extraction from an alert log. The log has one line per alert, in the following format:

[11/26/2013 9:13:41 AM]     Server1 LogTest: /var/log   Ok      Text Log test
[11/26/2013 9:13:36 AM]     Server1 LogTest: /var/log   Bad <......data.......> Text Log test

The difficulty comes when handling some Ok statuses: you'll notice here that a 'Bad' status returns data (the relevant log lines), but an 'Ok' status leaves the data section blank (actually just 2 tabs).

Every regex I come up with ends up capturing part of "Text Log test" and using it as the data section when data isn't present.

Can I get some pointers on the proper regular expression? My current regex is below, and I think I've exhausted the guess-and-check method. 🙂

]\t+\s+(?P<server>.+?)\s+(?P<category>.+?)\s(?P<object>.+?)\t(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)

RMartinezDTV
Path Finder

Probably tacky to accept my own answer, but here's the final result for reference:

]\t+\s+(?P<server>.+?)\s+(?P<category>.+?):\s(?P<object>.+?)\t(?P<status>.+?)\t(?P<data>.*)\t(?P<test>.*)\t

This correctly matches events even when a field is blank. Adjust the punctuation (\t, \s, :, and ]) as needed for your data.
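For anyone who wants to check the pattern outside Splunk, here is a minimal sketch in Python, whose `re` module also accepts the `(?P<name>...)` group syntax. The two sample lines are assumptions: the forum rendering collapses whitespace, so the tab and space positions (including the trailing tab this pattern requires) are guessed and should be adjusted to the real log. The helper name `extract` is hypothetical.

```python
import re

# The accepted pattern from this thread, unchanged.
PATTERN = re.compile(
    r"]\t+\s+(?P<server>.+?)\s+(?P<category>.+?):\s(?P<object>.+?)"
    r"\t(?P<status>.+?)\t(?P<data>.*)\t(?P<test>.*)\t"
)

# Hypothetical lines: whitespace placement is assumed, not taken from the real log.
OK_LINE = "[11/26/2013 9:13:41 AM]\t Server1 LogTest: /var/log\tOk\t\tText Log test\t"
BAD_LINE = (
    "[11/26/2013 9:13:36 AM]\t Server1 LogTest: /var/log"
    "\tBad\t<......data.......>\tText Log test\t"
)


def extract(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


for line in (OK_LINE, BAD_LINE):
    print(extract(line))
```

Under these assumptions the 'Ok' line yields an empty `data` field and `test` stays "Text Log test". If the log really is strictly tab-delimited, a tab-excluding class such as `[^\t]*` in place of `.*` would be a stricter alternative, since it cannot run across a column boundary.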


RMartinezDTV
Path Finder

Ayn, I actually read your notes about using search-time extractions here: http://answers.splunk.com/answers/67170/index-time-field-extraction, and I just learned from the docs what the difference is!


kallu
Communicator

Would it work better if you changed the end from

(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)

to

(?P<status>.+)\t(?P<data>.*)\t(?P<test>.+)

Then it should match an empty string when there are just 2 tabs, as in the "Ok" case. It sounds too easy, and I didn't test it with Splunk, so maybe I'm missing something?
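A quick way to see the difference between the two endings is to run them on just the tail of an 'Ok' event. This is a rough Python sketch rather than a Splunk test; the sample string and the helper name `tail_fields` are assumptions made for illustration.

```python
import re

# Tail of the original pattern: every group needs at least one character.
LAZY_TAIL = re.compile(r"(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)")
# Suggested tail: the data group is allowed to be empty.
RELAXED_TAIL = re.compile(r"(?P<status>.+)\t(?P<data>.*)\t(?P<test>.+)")

# Hypothetical tail of an "Ok" event: status, two consecutive tabs, test name.
OK_TAIL = "Ok\t\tText Log test"


def tail_fields(pattern, text):
    """Return the named groups as a dict, or None when nothing matches."""
    match = pattern.search(text)
    return match.groupdict() if match else None


print(tail_fields(LAZY_TAIL, OK_TAIL))     # None
print(tail_fields(RELAXED_TAIL, OK_TAIL))  # status 'Ok', data '', test 'Text Log test'
```

On this isolated tail the original ending finds no match at all, because `.+?` needs at least one character between the two tabs, while `.*` accepts the empty data section.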

RMartinezDTV
Path Finder

Thanks! This was almost perfect. See my accepted answer.


lukejadamec
Super Champion

My mistake, I meant a search-time field extraction.


Ayn
Legend

DELIMS doesn't work as an index-time extraction, and index-time extractions should be avoided unless you really know what you're doing and why.

lukejadamec
Super Champion

Have you tried setting this up as a search-time extraction, using the log's delimiter and a preset list of fields?
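To illustrate the delimiter idea in general terms, here is a small Python sketch. It is not Splunk configuration, and how Splunk's own delimiter-based extraction treats two consecutive delimiters should be checked against the docs. The field names, the sample line, and the assumption that every column is separated by exactly one tab are all hypothetical.

```python
# Hypothetical field names, in the order assumed for the tab-separated columns.
FIELD_NAMES = ["timestamp", "source", "status", "data", "test"]


def split_event(line):
    """Split one event on tabs and pair each column with its field name."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != len(FIELD_NAMES):
        return None  # Unexpected layout; leave the event unparsed.
    return dict(zip(FIELD_NAMES, columns))


# Hypothetical "Ok" event: the data column is empty, so two tabs sit side by side.
OK_LINE = "[11/26/2013 9:13:41 AM]\tServer1 LogTest: /var/log\tOk\t\tText Log test"
print(split_event(OK_LINE))
```

Splitting on an explicit tab keeps the empty column as an empty string, so the 'Ok' case needs no special handling. The server, category, and object would still need a second step, since in this assumed layout they share one column.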
