How to do a field extraction on userid?

mbasharat
Builder

I have an event as below:

2019-07-05 14:00:14 CDT d453bce1-aa68-4674-988e-ed6ab174a1d4 out: ID-sample.sample.com-1562306630255-1-1391 https://sample.sample.com:8675/api/sample/platform/audits {"messageId":"201","messageStatus":"Created","message":"Audit [appName=IDV, userType=TAXFILER, eventId=COLLECT, eventType=RESPONSES, fileSourceCd=IMF, ipAddr=00.00.00.00, returnCd=SUCCESS, sessionId=OLA_934d5c5f-974d-4b65-b0ca-288f03d5993e, vardata=<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><saasVarData><body>{\"deviceId\":\"ABC5426\",\"ipAddress\":\"00.00.00.00\",\"answers\":[{\"questionNumber\":\"1\",\"answer\":\"Y\"},{\"questionNumber\":\"2\",\"answer\":\"Y\"},{\"questionNumber\":\"3\",\"answer\":\"N\"}]}</body><host>sample1.sample.net</host><ipAddress>00.00.00.00</ipAddress><requestId>d453bce1-aa68-4674-988e-ed6ab174a1d4</requestId><responseStatus>0</responseStatus><uri>/ola/id-verify/responses</uri><userId>C3C7EA8A-8B7A-4574-BCB9-FC326816E63B</userId></saasVarData>]"}

I want to extract the userId field. I used Splunk's field extractor with the regular expression method. After the extraction, searches against this field do not return correct counts: only some events get a userId value, and the rest fall under "unknown". What am I missing? Thanks in advance.

The RegEx that Splunk created for me is:

^\d+\-\d+\-\d+\s+\d+:\d+:\d+\s+\w+\s+[a-f0-9]+\-\d+\-[a-f0-9]+\-[a-f0-9]+\-[a-f0-9]+\s+\w+:\s+\w+\-\w+\d+\w+\d+\-\w+\-\w+\-\w+\-\d+\-\d+\-\d+\s+\w+://\w+\d+\w+\d+\.\w+\.\w+\.\w+:\d+/\w+/\w+\-\w+/\w+/\w+\s+\{"\w+":"\d+","\w+":"\w+","\w+":"\w+\s+\[\w+=\w+\s+\w+\s+\d+\s+\d+:\d+:\d+\s+\w+\s+\d+,\s+\w+=\w+,\s+\w+=\w+,\s+\w+=\w+,\s+\w+=\w+,\s+\w+=\w+,\s+\w+=\d+\.\d+\.\d+\.\d+,\s+\w+=\w+,\s+\w+=\w+_[a-f0-9]+\-\d+\-[a-f0-9]+\-[a-f0-9]+\-[a-f0-9]+,\s+\w+=<\?\w+\s+\w+=\\"\d+\.\d+\\"\s+\w+=\\"\w+\-\d+\\"\s+\w+=\\"\w+\\"\?><\w+><\w+>\w+\d+\w+\.\w+\.\w+\.\w+</\w+><\w+>\d+\.\d+\.\d+\.\d+</\w+><\w+>\{\\"\w+\\":\w+,\\"\w+\\":\\"\w+\\",\\"\w+\\":\w+\}</\w+><\w+>[a-f0-9]+\-\d+\-[a-f0-9]+\-[a-f0-9]+\-[a-f0-9]+</\w+><\w+>\d+</\w+><\w+>/\w+/\w+\-\w+/\w+</\w+><\w+>(?P[^<]+)
1 Solution

rbechtold
Communicator

Hey Mbasharat,

If you're just trying to extract the userId field, this should work for you:

...BASE SEARCH...
| rex field=_raw "\<userId\>(?<userId>[^\<]+)"
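
For a quick check away from your real data, here is a minimal sketch that runs the same rex against a trimmed, made-up copy of the event built with makeresults. The shortened _raw string is an assumption for illustration, not your full event, and the backslashes before < and > are dropped because they are optional. The pattern is not anchored at ^ and only looks for the <userId> tag, so it does not depend on what comes earlier in the event. The generated expression has to match every token before the tag, which is the likely reason some events end up without a value.

```spl
| makeresults
| eval _raw="<saasVarData><uri>/ola/id-verify/responses</uri><userId>C3C7EA8A-8B7A-4574-BCB9-FC326816E63B</userId></saasVarData>"
| rex field=_raw "<userId>(?<userId>[^<]+)"
| stats count by userId
```

Once this behaves as expected, swap the first two lines for your base search and keep the rex and stats lines.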

However, what may be more useful to you is looking into the xmlkv command. Try it by adding this to your search:

...BASE SEARCH...
| table _time _raw
| xmlkv

A decent chunk of your data is in XML format (<field>value</field>). The xmlkv command automatically finds and extracts those fields. Since userId is in XML format, it will be extracted automatically too.
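
The same kind of dry run works for xmlkv. This is only a sketch on a made-up, shortened event: the _raw string keeps three of the tags from your sample and, as a simplifying assumption, leaves out the outer <saasVarData> wrapper and the rest of the log line.

```spl
| makeresults
| eval _raw="<requestId>d453bce1-aa68-4674-988e-ed6ab174a1d4</requestId><uri>/ola/id-verify/responses</uri><userId>C3C7EA8A-8B7A-4574-BCB9-FC326816E63B</userId>"
| xmlkv
| table requestId uri userId
```

If the fields show up here but not on your real events, compare the raw text of an event that fails against one that works.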

Here is the documentation on the xmlkv command if you're interested in learning more about how it works:
https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Xmlkv


mbasharat
Builder

The last solution is what I liked! 🙂 THANK YOU!!


rbechtold
Communicator

Hey Mbasharat,

Do you need the entire log extracted, or just the userId field?
