Splunk Search

Field extraction not working

tkwaller
Builder

Hello

I have a field extraction that pulls the email address out of a WSO2 log and names the field user.

So this log:

2016-07-11 20:38:30,633 priority sampledata-not_real-1111-simple-90 mydata.platform.stuff.yea.morestuff field=handler method=value scopeValue=email_address=myemail@smile.com|something:stuff=me&app=hello_stuff id=""

I have set the extraction to:

scopeValue=email_address=(?P<user>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})

When I run:

index=* user=myemail@smile.com earliest=-48h@h  sourcetype="wso2:am:runtime" "scopeValue=email_address=" | stats  count as "UserCountUsingField"

I get the above log, with the email in the user field.

When I run this search I do not get that log:

index=* user=myemail@smile.com earliest=-48h@h  sourcetype="wso2:am:runtime"  | stats  count as "UserCountUsingField"

Any idea why that wouldn't be working?
Thanks for the help!

acharlieh
Influencer

This sounds very similar to https://answers.splunk.com/answers/102528/field-discovery-extraction-works-but-extracted-field-value...

I wonder if you need a fields.conf on your search head with:

[user]
INDEXED_VALUE = false

to solve this issue. There might be a more efficient way by adjusting tokenization, per the other answer, but perhaps this will work? The unfortunate part is that it affects every field called user, not just the one in your particular sourcetype, since the setting is applied while the search is being built, before any data has been read.

dshpritz
SplunkTrust

That regex is a little bit of overkill if all you want is the user. You could try something like this in the sourcetype stanza in props.conf:

EXTRACT-email_user = email_address=(?<user>[^|]+)

Some explanation:

This regex is looking for the string "email_address=" and then the capture group contains a negated character class which says "all characters until a pipe".
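
For anyone who wants to check both patterns outside Splunk, here is a minimal Python sketch that runs them against the sample event from the question. The names `STRICT`, `LOOSE` and `extract_user` are just for illustration. Python needs the `(?P<user>...)` spelling of the named group, and the check only exercises the regex itself, not how Splunk applies it at search time.

```python
import re

# Sample event copied from the question.
LOG = ('2016-07-11 20:38:30,633 priority sampledata-not_real-1111-simple-90 '
       'mydata.platform.stuff.yea.morestuff field=handler method=value '
       'scopeValue=email_address=myemail@smile.com|something:stuff=me&app=hello_stuff id=""')

# The original extraction from the question.
STRICT = re.compile(
    r'scopeValue=email_address=(?P<user>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})'
)
# The simpler pattern suggested above: everything up to the next pipe.
LOOSE = re.compile(r'email_address=(?P<user>[^|]+)')


def extract_user(pattern, event):
    """Return the captured user value, or None if the pattern does not match."""
    match = pattern.search(event)
    return match.group('user') if match else None


print(extract_user(STRICT, LOG))
print(extract_user(LOOSE, LOG))
```

Both patterns capture the same value from this event, so the regex itself does not look like the cause of the missing result.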

HTH,

Dave

tkwaller
Builder

I added this and reapplied the configs. Still don't get this record when searching:
index=* user=myemail@smile.com earliest=-48h@h sourcetype="wso2:am:runtime" | stats count as "UserCountUsingField"

dshpritz
SplunkTrust

If you search without the stats part, do you see the "user" field in your field list?

tkwaller
Builder

This record is not returned at all when searching UNLESS you include "scopeValue". This returns the log I am looking for:
index=* user=myemail@smile.com earliest=-48h@h sourcetype="wso2:am:runtime" "scopeValue"

This returns nothing:
index=* user=myemail@smile.com earliest=-48h@h sourcetype="wso2:am:runtime"

dshpritz
SplunkTrust

Right, but in that case specifying a user seems incidental. As I asked before, if you search without the stats part, do you see the "user" field in your field list?

tkwaller
Builder

If this log is not returned, then there won't be a user field.

dshpritz
SplunkTrust

What I'm getting at is: does the field extraction work? If you look at events that should have this field extracted, is the field showing up?

tkwaller
Builder

It appears that the extraction is only partly working: it works for some addresses and not for others. I have not found out why, since the addresses that fail have the same format as the ones that work.
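
One way to narrow this down outside Splunk might be to run a handful of raw events through both the strict pattern and the looser "everything up to the pipe" pattern and list the events where they disagree. The sketch below is only that: the sample events are hypothetical (made up to show two ways a value can slip past the character class), `compare` is an illustrative helper, and it tests the regex alone, not Splunk's search-time behavior.

```python
import re

STRICT = re.compile(
    r'scopeValue=email_address=(?P<user>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})'
)
LOOSE = re.compile(r'email_address=(?P<user>[^|]+)')


def compare(events):
    """Return (event, strict_value, loose_value) for events where the patterns disagree."""
    mismatches = []
    for event in events:
        strict = STRICT.search(event)
        loose = LOOSE.search(event)
        strict_value = strict.group('user') if strict else None
        loose_value = loose.group('user') if loose else None
        if strict_value != loose_value:
            mismatches.append((event, strict_value, loose_value))
    return mismatches


# Hypothetical events, not taken from the real logs.
SAMPLES = [
    'scopeValue=email_address=myemail@smile.com|something:stuff=me',
    "scopeValue=email_address=o'brien@smile.com|something:stuff=me",   # apostrophe is outside the class
    'scopeValue=email_address= myemail@smile.com|something:stuff=me',  # stray leading space
]

for event, strict_value, loose_value in compare(SAMPLES):
    print(repr(strict_value), repr(loose_value), event)
```

If real events that fail in Splunk show no disagreement here, the regex is probably not the problem and the search side is the next place to look.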

gcusello
SplunkTrust

I don't know whether the square brackets are an artifact of the post formatting.
I tested your regex on
https://regex101.com/

with a small modification:

scopeValue\=email_address\=(?P<user>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+)

and I get the result you want:

myemail@smile.com

Bye.
Giuseppe

tkwaller
Builder

I added this and reapplied the configs. Still don't get this record when searching:

index=* user=myemail@smile.com earliest=-48h@h  sourcetype="wso2:am:runtime"  | stats  count as "UserCountUsingField"

gcusello
SplunkTrust

Try using double quotes.

index=* user="myemail@smile.com" earliest=-48h@h sourcetype="wso2:am:runtime" | stats count as "UserCountUsingField"

Bye.
Giuseppe

tkwaller
Builder

Unfortunately, I get the same results.

tkwaller
Builder

NOTE: I had to use brackets instead of the proper <> for the field name in the regex because of formatting in this page

sundareshr
Legend

How did you set up the extraction for the email address? Check the permissions on the field extraction.

tkwaller
Builder

Permissions are Global: readable by all and writable to admin and power
