I have manually set up search-time field extractions using regular expressions in props.conf.
For a single sourcetype, a total of 8 "sourcetype:EXTRACT-" rules are specified.
Most events have all fields extracted correctly, but in some events one of the fields is not extracted.
However, when I add "| extract" to the end of the search, all fields are extracted correctly for every event.
What would be the cause of this discrepancy?
You most likely used the Splunk UI to generate the regular expression for your field extractions, and I would bet the resulting regex is not very specific. That is most likely where it is failing. We are happy to help once you post more information.
I played around with the regex, and that seems to have fixed the issue. However, the original regex does not appear to be wrong. Please see the regex rules below.
Original regex: [fw4_rule_traffic]\s[(?P([^]]+|))](|\s)(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^.]+)
Updated regex:^[^[\n]][fw4_rule_traffic]\s+[(?P[^]]+|)]\s*(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P[^,]+),(?P\d+),(?P\d+),(?P\d+),(?P\d+)
Event in which v3 (value=dmzfw_1) is not extracted (the other fields in this event, and the fields in other events, are extracted successfully):
timestamp ip_addr 1 timestamp [fw4_rule_traffic] [xx.xx.xx.xx]2017-03-09 10:10:00,dmzfw_1,30,ALLOW,32,13608,0,0.
Some notes on the sample event:
- The masked ip_addr can be an empty string (i.e. [])
- There can be a space between [ip_addr] and the following timestamp (i.e. [x.x.x.x] 2017...)
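For anyone who wants to check a pattern against these variants outside Splunk, here is a minimal Python sketch. It is only an approximation: Splunk uses PCRE rather than Python's re module, and the capture group names (src_ip, event_time, fw_name and so on) are hypothetical, since the names in the regexes above did not survive posting. The pattern is my reading of the updated regex, not a copy of the actual props.conf entry.

```python
import re
from typing import Dict, Optional

# Hypothetical group names; the real ones are not visible in the thread.
PATTERN = re.compile(
    r"\[fw4_rule_traffic\]\s+"        # literal tag, then whitespace
    r"\[(?P<src_ip>[^\]]*)\]\s*"      # bracketed ip_addr, may be empty
    r"(?P<event_time>[^,]+),"         # timestamp up to the first comma
    r"(?P<fw_name>[^,]+),"            # e.g. dmzfw_1
    r"(?P<rule_id>[^,]+),"
    r"(?P<action>[^,]+),"
    r"(?P<c1>\d+),(?P<c2>\d+),(?P<c3>\d+),(?P<c4>\d+)"  # four numeric counters
)


def extract_fields(event: str) -> Optional[Dict[str, str]]:
    """Return the named groups for an event, or None if the pattern does not match."""
    match = PATTERN.search(event)
    return match.groupdict() if match else None


PREFIX = "timestamp ip_addr 1 timestamp [fw4_rule_traffic] "
TAIL = "2017-03-09 10:10:00,dmzfw_1,30,ALLOW,32,13608,0,0."

# The posted event plus the two variants described in the notes.
SAMPLES = {
    "as posted": PREFIX + "[xx.xx.xx.xx]" + TAIL,
    "empty ip_addr": PREFIX + "[]" + TAIL,
    "space after ip_addr": PREFIX + "[x.x.x.x] " + TAIL,
}

if __name__ == "__main__":
    for label, event in SAMPLES.items():
        fields = extract_fields(event)
        print(label, "->", fields["fw_name"] if fields else "NO MATCH")
```

If a variant prints NO MATCH here, the pattern itself is the problem. If every variant matches, the cause is more likely in how Splunk applies the extractions than in the regex.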
Are you running the search in "Verbose Mode"? The search mode selector under the time picker offers "Fast Mode", "Smart Mode", and "Verbose Mode". Make sure it is set to "Verbose Mode" for your testing.
If that does not fix the issue, you will need to share the details of your props.conf and your raw data so that the community can help you efficiently.
Thanks. As I commented below, tweaking the regex did seem to fix the issue, but it is still not clear why it happened, because both regex rules match the event.