Splunk Search

Why are events not returned for a search on a search-time extracted field?

Path Finder

We have a field extraction in apps/search/local/props.conf like this:

[my_glog_kv]
...
EXTRACT-my_glog_kv = ^(?<severity>[IEWF])

And example log event:

F0106 09:02:03.592142  4628 hal_impl.cc:1042] Check failed: logged_ptr != nullptr id="106"

So we expect the Splunk field extraction to put 'F' into the severity field, but for some reason this search does NOT find the above event:

... severity=F

Interestingly, all of these searches do succeed, and Field:severity Value:F is listed in the event viewer in Splunk Web.

... severity=F source=mysource.log
... severity=F nullptr
... severity=F*
... severity=*F
... | regex severity=F

Any idea why the simple severity=F search does not work?

1 Solution

Champion

This happens because of two facts that are not exactly obvious. First, the "F" you are looking for is not a segment of your event: it is only part of the larger segment F0106. Second, Splunk fetches events from disk based on segments, and field extraction only happens after the events are fetched.
If you don't want to read up on this rather technical chapter of Splunk, suppose for example you have data like this: event a)

valid command df

and b)

invalid command asdfg

Now imagine you want to find events with valid commands using the search "valid command". Splunk will return event a) but not event b), even though b) literally contains your string, and that is the behavior you would expect. The segments event b) contains are invalid, command and asdfg, because they are separated by whitespace; valid is not one of them. If you search for "*valid command" instead, you will get both event a) and b), which is also not surprising.

In the above example, Splunk works as you would expect. In your case, unfortunately, the same mechanism does not give you the desired result: what Splunk fetches from disk is determined by segments, and your event doesn't contain a segment "f". Therefore no events are fetched, and the search-time field extraction never gets to run on them.
I hope this helps you understand the issue better. Feel free to come back with any further questions.
Oh and thanks to @martin_mueller for assistance in this matter!

Path Finder

Thank you, this is a great answer.

It also explains the further mystery of why our severity=F query sometimes did work. For example, severity=F would find this event:

F0106 09:02:03.592142  4628 other.cc:1042] command="f"

I guess that is because the "f" is a segment, so Splunk fetches (and presumably field-extracts) the event and then discovers that there is a match on the severity=F field.

But this does seem to be a pretty hidden and serious limitation of Splunk search-time field extractions: they only work on event segments.

I suppose our only option would be to make severity an indexed field (index-time extraction).
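
For reference, an index-time extraction along those lines might look roughly like the sketch below. The transform name my_glog_severity is made up for illustration, the [my_glog_kv] stanza is taken from the props.conf above, and the regex assumes the severity letter is always followed by a digit. It would only apply to data indexed after the change.

```
# transforms.conf (transform name is hypothetical)
[my_glog_severity]
REGEX = ^([IEWF])\d
FORMAT = severity::$1
WRITE_META = true

# props.conf, wherever the data is parsed (indexer or heavy forwarder)
[my_glog_kv]
TRANSFORMS-severity = my_glog_severity

# fields.conf, on the search head: tells search that severity is an indexed field
[severity]
INDEXED = true
```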

SplunkTrust

I wouldn't go as far as saying "search time field extractions only work on event segments" - more accurately, "extracting search time fields from a partial segment requires additional configuration".

For example, in fields.conf you could tell Splunk to skip the optimization step of only loading events that contain the segment f off disk before applying the regular expression. It'll be slower, but the extraction will work.
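
One fields.conf setting that controls this is INDEXED_VALUE. A minimal sketch, assuming the field is named severity as in the question and that the file sits next to the existing props.conf (the location is an assumption):

```
# fields.conf, e.g. apps/search/local/fields.conf (location assumed)
[severity]
# Do not assume the field value appears as an indexed term in the raw event;
# events are fetched first and filtered on severity after extraction.
INDEXED_VALUE = false
```

With this in place, a search on severity=F should no longer be narrowed to events containing the term f, at the cost of scanning more events.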

Champion

Good point. One solution is to make it an indexed field, but that brings all the downsides of index-time extraction with it.
You could also try to edit your data, either at its source (if you are in control of the application producing these logs) or with SEDCMD during indexing: if you insert a whitespace between the first two characters (so F0106 becomes F 0106), the severity letter becomes a segment of its own and Splunk will get the segmentation right from the beginning.
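
A SEDCMD along these lines could do the rewrite. This is only a sketch: the class name split_severity is made up, the stanza name is taken from the question, and the pattern assumes every event starts with one of the letters I, E, W or F followed by a digit.

```
# props.conf on the indexer or heavy forwarder (class name is hypothetical)
[my_glog_kv]
# Turn a leading "F0106" into "F 0106" before the event is written to the index
SEDCMD-split_severity = s/^([IEWF])(\d)/\1 \2/
```

This changes the stored raw text of newly indexed events only; the existing EXTRACT regex ^(?<severity>[IEWF]) would still match the rewritten events.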
