After creating an extracted field with the Field Extractor in Splunk Web, why are new values not appearing when searching the field?

mike314
Explorer

I've created an extracted field using the field extractor GUI in Splunk Web. When I created it, there were two values for that field. Now that further logs have been processed, there is a new value for that extracted field.

The issue is that the new value does not appear in the field summary; only the previous two values show up. Also, searches where the extracted field equals the new value return no results. But when I search for the new value as plain text, the results are there.

Specifically, this is extracting the log message type (INFO,WARN,FATAL) from a custom application we've built. The regular expression generated by the field extractor GUI is: ^[^\[\n]*\[(?P<Event_type>\w+) which is meant to match the tag inside of the brackets from logs that look like this:

2016-12-14 01:02:03 [INFO] Process started.
2016-12-14 01:03:04 [WARN] Some error has happened.
2016-12-14 01:03:44 [INFO] Reticulating splines.
2016-12-14 01:04:05 [FATAL] Process failed!
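
As a quick sanity check outside Splunk, the generated expression can be run against the four sample lines above. This is only a Python sketch: it uses Python's `re` module rather than Splunk's extraction pipeline, and the names `GENERATED`, `SAMPLE_EVENTS`, and `extract_event_type` are made up for illustration.

```python
import re

# Regex copied verbatim from the post (produced by the field extractor GUI).
GENERATED = re.compile(r"^[^\[\n]*\[(?P<Event_type>\w+)")

# The four sample log lines from the post, one event per string.
SAMPLE_EVENTS = [
    "2016-12-14 01:02:03 [INFO] Process started.",
    "2016-12-14 01:03:04 [WARN] Some error has happened.",
    "2016-12-14 01:03:44 [INFO] Reticulating splines.",
    "2016-12-14 01:04:05 [FATAL] Process failed!",
]


def extract_event_type(event):
    """Return the captured Event_type, or None if the regex does not match."""
    match = GENERATED.match(event)
    return match.group("Event_type") if match else None


extracted = [extract_event_type(event) for event in SAMPLE_EVENTS]
print(extracted)  # ['INFO', 'WARN', 'INFO', 'FATAL']
```

Run one event at a time, the expression captures all three values, including FATAL, so for these sample lines the regex itself does not look like the reason the new value is missing.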

You can see here that the extracted field is working for two values:
[Screenshot: Splunk field value summary]

I've even tried to use the field extractor GUI again on one of the results that does have the new value for this field. But it shows that it is already recognized as the extracted field I created:
[Screenshot: Splunk field extractor view]

So why is the new value not appearing in the summary or able to be searched directly using the extracted field?

gokadroid
Motivator

Please try this regex and it should work all the time whether you use it at searchtime or extraction time:

\d\s\[(?<Event_type>[^\]]+)\]\s

See extraction here

If you want to search, try:

your query to return events
| rex field=_raw "\d\s\[(?<Event_type>[^\]]+)\]\s"
| table Event_type

mike314
Explorer

Thanks for the answer. But I'm curious to know why the regex generated by the UI doesn't work as I would expect it to? And for that matter how/why does the regex you've provided fix it? This is something I assume I'll be needing to handle in the future so I'd like a better understanding of where I went wrong. I did look at the documentation for field extraction but I didn't see anything that seemed to call out that this would happen. Links to relevant doco would be very appreciated as well.

gokadroid
Motivator

The field extractor builds its regex from the log line you selected, so the result can be stricter than it needs to be: it fits events that look like the selected line but may not suit some other cases.

Why did my regex work? It is a more general regex that fits all the cases you provided:

Look for a digit, then a whitespace character, then an opening square bracket; capture everything up to the closing square bracket; then require the closing square bracket followed by a whitespace character.
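
The step-by-step description above can be tried out with a small Python sketch. Two assumptions to note: Python's `re` module spells named groups `(?P<name>...)`, so the group syntax is respelled here, and the last three test lines are hypothetical, not taken from the application's logs.

```python
import re

# Suggested expression from this thread, with (?<Event_type>...) respelled
# as (?P<Event_type>...) because Python's re module requires the P.
SUGGESTED = re.compile(r"\d\s\[(?P<Event_type>[^\]]+)\]\s")


def extract_event_type(event):
    """Return the captured Event_type, or None if the regex does not match."""
    match = SUGGESTED.search(event)  # unanchored search anywhere in the event
    return match.group("Event_type") if match else None


# A line from the post: digit, whitespace, "[", capture, "]", whitespace.
from_post = extract_event_type("2016-12-14 01:04:05 [FATAL] Process failed!")

# Hypothetical lines (not from the post) that probe the edges of the pattern.
no_timestamp = extract_event_type("[FATAL] Process failed!")          # no digit before "["
no_trailing_text = extract_event_type("2016-12-14 01:04:05 [FATAL]")  # nothing after "]"
two_words = extract_event_type("2016-12-14 01:04:05 [NOT OK] Failed") # space inside brackets

print(from_post, no_timestamp, no_trailing_text, two_words)
```

The pattern is looser about what may appear inside the brackets (anything except `]`) but stricter about the surroundings, since it needs a digit and whitespace before the bracket and whitespace after it. Whether that counts as more general depends on what the real logs look like.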

How to check if something like this will happen again?

Click on the matches and non-matches tabs during extraction to see whether there are cases where the generated regex didn't work, so you can tweak it. If there are 0 non-matches, the regex works for all the events that were loaded.

mike314
Explorer

I've updated the regular expression for my existing extracted field to the one you provided and it still only returns the original two values. Is there some setting I'm missing to force it to re-index the values for the field? The settings for the extracted field appear to be just the expression used to define it.

mike314
Explorer

In fact, after changing the expression the field extractor stopped showing me that [FATAL] was an extracted field. So perhaps the original expression works better? I'm quite rusty at regex so I'm not inclined to guess.

Either way the problem seems to be that Splunk isn't figuring out that this extracted field has a new value despite the fact that the regex is valid to match it. I feel like there is something wrong in either the way I've configured the field or my understanding of how extracted fields work. Or maybe it is a bug? But probably too early to call it that...

mike314
Explorer

But when I use your search query and replace your expression with the one generated by the GUI (source_query | rex field=_raw "^[^\[\n]*\[(?P<Event_type>\w+)" | stats count BY Event_type), it finds all three values. Yet when I just look at the extracted field I've built, the new value doesn't show up. So I'm confused. The generated regex seems valid.
The only difference I see in regex101 is that your expression matches all the rows whereas the generated expression only matches the first row and then stops. (It doesn't matter which row is first though. So it will match the [FATAL] log line.)
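
One possible explanation for the regex101 difference, assuming the sample lines were pasted as a single block and tested without multiline mode: the generated expression starts with `^`, which then only matches at the very start of the whole text, while the other expression is unanchored. The Python sketch below reproduces that behaviour; it says nothing about how Splunk applies the extraction, and the variable names are illustrative only.

```python
import re

# Both expressions from this thread, using Python's (?P<name>...) group syntax.
GENERATED = r"^[^\[\n]*\[(?P<Event_type>\w+)"
SUGGESTED = r"\d\s\[(?P<Event_type>[^\]]+)\]\s"

# The four sample lines pasted as one block of text, as in a regex tester.
BLOCK = "\n".join([
    "2016-12-14 01:02:03 [INFO] Process started.",
    "2016-12-14 01:03:04 [WARN] Some error has happened.",
    "2016-12-14 01:03:44 [INFO] Reticulating splines.",
    "2016-12-14 01:04:05 [FATAL] Process failed!",
])

# Without MULTILINE, ^ matches only at the start of the whole block.
anchored_single = re.findall(GENERATED, BLOCK)

# With MULTILINE, ^ also matches right after every newline.
anchored_multi = re.findall(GENERATED, BLOCK, flags=re.MULTILINE)

# The unanchored expression finds every line regardless of the flag.
unanchored = re.findall(SUGGESTED, BLOCK)

print(anchored_single)  # ['INFO']
print(anchored_multi)   # ['INFO', 'WARN', 'INFO', 'FATAL']
print(unanchored)       # ['INFO', 'WARN', 'INFO', 'FATAL']
```

If each log line is tested as its own string, which is closer to one event per line, the `^` anchor stops mattering and both expressions capture the tag.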

mike314
Explorer

Mainly I'd just like to understand why one works and the other doesn't.

gokadroid
Motivator

For the log lines provided above, either regex should have worked, unless some other case tripped it up. So always check the matches and non-matches tabs when extracting with the field extractor.
