Why is manual field extraction not working for some fields in some events?

Splunk Employee

I have manually set up search-time field extractions with regular expressions in props.conf.
One of the fields is not extracted for some events, although it is extracted for most.
For this single sourcetype, a total of 8 "sourcetype:EXTRACT-" rules are specified.

Most events have all fields extracted correctly, but some events end up without that one field.
However, when I add "| extract" to the end of the search, all fields are extracted correctly for all events.

What could be causing this discrepancy?

Splunk Employee
  1. You didn't provide any sample events.
  2. You didn't show your props.conf regular expressions for the field extractions.
  3. You are asking us to guess at a solution with very limited information.

Most likely you used the Splunk UI to create the regular expressions for your field extractions, and I would bet the resulting regexes are not very specific. That is most likely where the extraction is failing. We are happy to help once you post more information.

Splunk Employee

I played around with the regex, and that seems to have fixed the issue. However, the original regex does not appear to be wrong. Please see the regex rules below.

Original regex: [fw4_rule_traffic]\s[(?P([^]]+|))](|\s)(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^.]+)

Updated regex: ^[^[\n]][fw4_rule_traffic]\s+[(?P[^]]+|)]\s*(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P\d+),(?P\d+),(?P\d+),(?P\d+)

Event in which v3 (value=dmzfw_1) is not extracted (the other fields in this event, and all fields in other events, are extracted successfully):
timestamp ip_addr 1 timestamp [fw4_rule_traffic] [xx.xx.xx.xx]2017-03-09 10:10:00,dmzfw_1,30,ALLOW,32,13608,0,0.

Some notes on the sample event:
- The masked ip_addr can be an empty string (i.e. [])
- There can be a space between [ip_addr] and the following timestamp (i.e. [x.x.x.x] 2017...)
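
One way to check a rule against the sample event and the two variations noted above, outside Splunk, is a small Python script. This is only a sketch under assumptions: the pattern is a guess at the original rule with the square brackets escaped, and the group names (ip, ts, name, f3 to f8) are placeholders, since the real capture-group names are not shown in this post. Python's re module is also not the regex engine Splunk uses, so a match here does not prove the search-time extraction behaves the same way.

```python
import re

# Assumed reconstruction of the original rule: literal brackets are escaped and
# the group names are hypothetical placeholders, not the names from props.conf.
PATTERN = re.compile(
    r"\[fw4_rule_traffic\]\s\[(?P<ip>[^\]]*)\]\s?"
    r"(?P<ts>[^,]+),(?P<name>[^,]+),(?P<f3>[^,]+),(?P<f4>[^,]+),"
    r"(?P<f5>[^,]+),(?P<f6>[^,]+),(?P<f7>[^,]+),(?P<f8>[^.]+)"
)


def extract(event: str) -> dict:
    """Return the named groups for the first match, or an empty dict."""
    match = PATTERN.search(event)
    return match.groupdict() if match else {}


# The masked sample event from this post.
BASE = (
    "timestamp ip_addr 1 timestamp [fw4_rule_traffic] "
    "[xx.xx.xx.xx]2017-03-09 10:10:00,dmzfw_1,30,ALLOW,32,13608,0,0."
)

# The two variations described in the notes: empty ip_addr, and a space
# between the closing bracket and the timestamp.
VARIANTS = {
    "as posted": BASE,
    "empty ip_addr": BASE.replace("[xx.xx.xx.xx]", "[]"),
    "space after ip_addr": BASE.replace("[xx.xx.xx.xx]", "[xx.xx.xx.xx] "),
}

if __name__ == "__main__":
    for label, event in VARIANTS.items():
        fields = extract(event)
        print(f"{label}: name={fields.get('name')!r} ip={fields.get('ip')!r}")
```

If every variant yields the expected value here, the pattern itself is probably not what distinguishes the failing events, and the difference is more likely in how the rules are applied at search time.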

Champion

Are you running the search in "Verbose Mode"? There is a toggle under the time picker that offers "Fast Mode", "Smart Mode", and "Verbose Mode". Make sure it is set to "Verbose Mode" for your testing.

If that does not fix the issue, you will need to share the details of your props.conf and your raw data so that the community can help you efficiently.

Splunk Employee

Thanks. As commented below, tweaking the regex did seem to fix the issue, but it is still not clear why it happened, because both regex rules match the event.