Splunk Search

KV_MODE=json sometimes skips a particular JSON field?

zanglang
Engager

We have a log file with multiple lines of JSON similar to this:

{ "foo": "bar","foo1":"foo2","userEmail":"foo@bar.com"}
{ "foo": "bar","foo1":"foo2","userEmail":"foo1@bar.com"}
{ "foo": "bar","foo1":"foo2","userEmail":"foo2@bar.com"}

Search-time extraction works fine for almost all of the fields, except one. Oddly, around 7-8% of all events do not have userEmail extracted automatically, according to the field's Event Coverage, even after I defined the extraction manually in props.conf. I verified this with the queries:

index=foo | search userEmail=*
index=foo | search NOT userEmail=*

Events are sent from a forwarder with this props.conf:

[foo]
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y%m%d%H%M%S%3N
TIME_PREFIX = \"timestamp\":\"
TZ = UTC
KV_MODE = json
disabled = false
TRUNCATE = 0

I added these on the search head earlier today to force search-time extraction for userEmail, but it didn't work, even though I verified in Splunk Web that the regex matches all emails:

[foo]
EXTRACT-userEmail = "userEmail":"(?P<userEmail>[^"]+)
KV_MODE = json

Any idea why this might happen?

1 Solution

zanglang
Engager

Turns out the issue was caused by an autolookup with an outdated CSV lookup file where userEmail was one of the autolookup fields. Updating the CSV fixed my issue.

In this case, it appears that all userEmails that did exist in the lookup table would autolookup and extract correctly, while userEmails that were not present in the CSV would fail the autolookup, which for some reason also broke field extraction.
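
For anyone checking whether a lookup file has gone stale in the same way, one option is to compare the userEmail values in the raw log against the CSV. The sketch below is only an illustration: the function name `missing_emails`, the file paths, and the assumption that the CSV has a header row with the email in its first column are all hypothetical and not taken from the original setup.

```shell
# Hypothetical helper: list userEmail values that occur in a JSON-lines log
# but not in the first column of a lookup CSV (assumed layout, see above).
# Usage: missing_emails LOGFILE LOOKUP_CSV
missing_emails() {
    log_file=$1
    lookup_csv=$2
    tmp_dir=$(mktemp -d) || return 1

    # Pull every "userEmail":"..." value out of the log, one per line.
    grep -o '"userEmail" *: *"[^"]*"' "$log_file" \
        | sed 's/^"userEmail" *: *"//; s/"$//' \
        | sort -u > "$tmp_dir/log_emails"

    # Skip the CSV header row, keep the first column, strip quotes and CRs.
    tail -n +2 "$lookup_csv" \
        | cut -d, -f1 \
        | tr -d '"\r' \
        | sort -u > "$tmp_dir/csv_emails"

    # comm -23 prints the lines that appear only in the first (log) list.
    comm -23 "$tmp_dir/log_emails" "$tmp_dir/csv_emails"
    rm -rf "$tmp_dir"
}
```

Any address this prints is one the autolookup would not find in the CSV, so those are the events to check first with `search NOT userEmail=*`.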

richgalloway
SplunkTrust

@zanglang If your problem is resolved, please accept an answer to help future readers.

woodcock
Esteemed Legend

Open a support case.

jhollfelder_spl
Splunk Employee

I'm experiencing the same thing.

I have JSON-formatted data from the NetApp ONTAP add-on that contains a pool_id field. If I search the correct index and sourcetype and add "| extract" or "| spath", pool_id gets extracted correctly; otherwise, Splunk extracts what appears to be every other field except this one. Scratching head...

jhollfelder_spl
Splunk Employee

So it turns out that my problem was caused by a FIELDALIAS setting that was overwriting a field that actually existed in the data with a field that didn't exist. If I had run btool as suggested by @wenthold, I would have found this much faster.

It turns out that if the add-on's props.conf had used "ASNEW" instead of "AS" in the FIELDALIAS definition, Splunk would have kept the field extraction it found in the data rather than overwriting it with a field that didn't exist. The update I made to local/props.conf for that add-on was:
FIELDALIAS-array_id = uuid ASNEW array_id
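
To spot other aliases that could overwrite an extracted field in the same way, the btool output can be filtered for FIELDALIAS definitions that use AS instead of ASNEW. This is a rough sketch: the function name `overwriting_aliases` is made up for illustration, and it assumes the usual `btool ... --debug` line layout of a file path followed by `key = value`.

```shell
# Hypothetical filter: read `btool props list --debug` style output on stdin
# and print only the FIELDALIAS lines that use AS (overwrite) somewhere,
# skipping definitions that use ASNEW exclusively.
overwriting_aliases() {
    grep 'FIELDALIAS-' | grep -E '[[:space:]]AS[[:space:]]'
}
```

It would be fed from btool, for example: $SPLUNK_HOME/bin/splunk btool props list --debug | overwriting_aliases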

wenthold
Communicator

Have you tried btool? If you're running on Linux, I would try something like:

$SPLUNK_HOME/bin/splunk btool props list --debug | grep "userEmail"

Field aliasing, calculated fields, and lookups are all performed at search time after KV extractions, so maybe there's an errant config that's overwriting your field.
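
As a rough sketch of that check, the filter below takes btool output on stdin and prints only the settings that run after KV extraction and mention a given field, grouped as aliases first, then calculated fields, then lookups (the order in which Splunk is documented to apply them, as far as I know). The function name `settings_after_kv` is hypothetical.

```shell
# Hypothetical filter: from `btool props list --debug` output on stdin, print
# the FIELDALIAS-, EVAL- and LOOKUP- settings that mention the given field.
# Usage: ... | settings_after_kv FIELD
settings_after_kv() {
    field=$1
    # Buffer stdin so it can be scanned once per setting class.
    input=$(cat)
    for class in FIELDALIAS- EVAL- LOOKUP-; do
        # Keep lines whose setting name starts with this class and that
        # contain the field name; "|| true" tolerates classes with no match.
        printf '%s\n' "$input" \
            | grep -E "(^|[[:space:]])$class" \
            | grep -F -- "$field" || true
    done
}
```

For the original question this would be run as: $SPLUNK_HOME/bin/splunk btool props list --debug | settings_after_kv userEmail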
