Why is my field extraction not consistent across all events?

Path Finder

I want to extract a field in UUID format and name it instanceid.

props.conf settings

EXTRACT-fields_5 = \[[i]nstance:\s+(?P<instanceid>[0-9a-f]{8}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{12})

For logs like ...

2017-01-01 00:00:00.000 99999 INFO xxxxxxxxxxxx [-] [instance: 01234567-89ab-cdef-0123-456789abcdef] Instance destroyed successfully.

However, the extraction works for some events but not for others.
When I change the field name in the regex to nstanceid or istanceid, it works for all events. I don't know what is wrong with the field name instanceid.
On the other hand, the rex command with the same regex (field name instanceid) works fine.

Can somebody explain why?


Re: Why is my field extraction not consistent across all events?

SplunkTrust

Hi diavolo,

Try the following:

(?:\[instance:\s+)(?P<instanceid>[0-9a-f]{8}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{4}\-[0-9a-f]{12})(?:\])

Should work fine now.


Re: Why is my field extraction not consistent across all events?

Path Finder

Unfortunately, it doesn't work. The field still isn't extracted from some events.


Re: Why is my field extraction not consistent across all events?

SplunkTrust

1) When you say "change the field name", are you talking about the underlying data, or the field name being extracted by the regex?
2) Can you post an example of an event that the extraction did NOT work for?


Re: Why is my field extraction not consistent across all events?

Path Finder

1) The latter. I changed the regex from (?P<instanceid>...) to (?P<nstanceid>...), and it worked.
2)
- Worked:
2017-01-06 03:08:35.416 21995 INFO nova.virt.libvirt.driver [-] [instance: 40624b9c-8179-4cb0-82ec-924ee5362cc0] Instance destroyed successfully.
- Did not work:
2017-01-06 03:07:25.932 21995 DEBUG nova.network.neutronv2.api [-] [instance: 6708c71b-0f49-4b0b-8040-fec13e3e2a4b] getinstancenwinfo() _getinstancenwinfo /usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py:602
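
As a quick sanity check outside Splunk, the pattern can be run against both events above. This is only a sketch using Python's `re` module rather than Splunk's PCRE engine, and `extract_instanceid` is a hypothetical helper name. It suggests the pattern itself matches both lines, so the inconsistency probably lies somewhere other than the regex text.

```python
import re

# Same pattern as the EXTRACT setting, written for Python's re module.
# Python uses the (?P<name>...) form for named groups.
PATTERN = re.compile(
    r"\[instance:\s+(?P<instanceid>"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)

# The two events posted above, copied verbatim.
EVENTS = [
    "2017-01-06 03:08:35.416 21995 INFO nova.virt.libvirt.driver [-] "
    "[instance: 40624b9c-8179-4cb0-82ec-924ee5362cc0] "
    "Instance destroyed successfully.",
    "2017-01-06 03:07:25.932 21995 DEBUG nova.network.neutronv2.api [-] "
    "[instance: 6708c71b-0f49-4b0b-8040-fec13e3e2a4b] "
    "getinstancenwinfo() _getinstancenwinfo "
    "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py:602",
]


def extract_instanceid(event):
    """Return the captured UUID, or None when the pattern does not match."""
    match = PATTERN.search(event)
    return match.group("instanceid") if match else None


if __name__ == "__main__":
    for event in EVENTS:
        print(extract_instanceid(event))
```

Both the INFO and the DEBUG event yield a UUID here, which is consistent with the observation that the rex command works with the same regex.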


Re: Why is my field extraction not consistent across all events?

SplunkTrust

The problem may be the (?P at the beginning of the regex.

Also, I believe you can shorthand hex digits as \h, so your regex can look a bit cleaner if you try this -

 EXTRACT-fields_5 = \[instance:\s+(?<instanceid>\h{8}\-\h{4}\-\h{4}\-\h{4}\-\h{12})

See this page for more details: http://www.regular-expressions.info/refext.html


Re: Why is my field extraction not consistent across all events?

Path Finder

Using (?<instanceid>...) instead of (?P<instanceid>...) didn't fix the problem. Also, \h for hex didn't work.
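
For what it's worth, in PCRE `\h` stands for horizontal whitespace rather than a hex digit, which would explain why that variant matched nothing. If the aim is only a tidier pattern, one option is to build it from a small fragment and paste the resulting string into props.conf. The sketch below uses Python; `HEX`, `UUID` and `PATTERN_TEXT` are made-up names for illustration.

```python
import re

# Explicit hex-digit class (lowercase only, as in the original regex).
HEX = "[0-9a-f]"

# Five hex groups of length 8-4-4-4-12, joined by hyphens.
UUID = "-".join(f"{HEX}{{{n}}}" for n in (8, 4, 4, 4, 12))

# Full extraction pattern, with the capture group named instanceid.
PATTERN_TEXT = rf"\[instance:\s+(?P<instanceid>{UUID})"

if __name__ == "__main__":
    # Print the expanded regex so it can be copied elsewhere.
    print(PATTERN_TEXT)

    sample = (
        "2017-01-01 00:00:00.000 99999 INFO xxxxxxxxxxxx [-] "
        "[instance: 01234567-89ab-cdef-0123-456789abcdef] "
        "Instance destroyed successfully."
    )
    match = re.search(PATTERN_TEXT, sample)
    print(match.group("instanceid") if match else None)
```

The expanded string is the same regex as in the original props.conf setting, minus the unnecessary backslashes before the hyphens.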


Re: Why is my field extraction not consistent across all events?

SplunkTrust

Hi diavolo,

My guess would be that some events actually contain a field called instanceid already.
Try a completely new field name to test your field extraction. Something like this should work for you:

 \[instance:\s+(?<ThisIsMyTestFieldName>[^\]]+)
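
To see how this looser test pattern differs from the strict UUID pattern, here is a small Python sketch (Python's `re`, not Splunk itself, and using the `(?P<name>...)` form that Python requires). `compare` is a hypothetical helper, and the second sample line in the usage part is invented purely for illustration.

```python
import re

# Loose test pattern: capture everything up to the closing bracket.
LOOSE = re.compile(r"\[instance:\s+(?P<ThisIsMyTestFieldName>[^\]]+)")

# Strict pattern from the original question: capture only a lowercase UUID.
STRICT = re.compile(
    r"\[instance:\s+(?P<instanceid>"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def compare(event):
    """Return (loose_value, strict_value); None where a pattern fails."""
    loose = LOOSE.search(event)
    strict = STRICT.search(event)
    return (
        loose.group("ThisIsMyTestFieldName") if loose else None,
        strict.group("instanceid") if strict else None,
    )


if __name__ == "__main__":
    real = (
        "2017-01-01 00:00:00.000 99999 INFO xxxxxxxxxxxx [-] "
        "[instance: 01234567-89ab-cdef-0123-456789abcdef] "
        "Instance destroyed successfully."
    )
    # Invented line, only to show where the two patterns disagree.
    made_up = "2017-01-01 00:00:00.000 99999 INFO xxxxxxxxxxxx [-] [instance: not-a-uuid] test"
    print(compare(real))
    print(compare(made_up))
```

If both patterns return the same value for every real event, the value format is not the issue and the field name or configuration precedence is the more likely suspect.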

cheers, MuS


Re: Why is my field extraction not consistent across all events?

Path Finder

Thanks MuS,
instanceid is not used anywhere else. Changing the field name to something like instance_id works fine, but I still wonder why.


Re: Why is my field extraction not consistent across all events?

Path Finder

Mmm... After I changed the extracted field name in the regex from instanceid to instance_id as a workaround, it stopped working for some events. It worked fine right after I made the change, but one hour later it no longer did.
