Splunk Search

How do you fetch JSON embedded in plain text logs using regex or spath?

maulikdesai21
Engager

I need to extract a value from JSON data embedded in a log event. I am aware of spath, but I believe it expects JSON as its input, and my events mix plain text and JSON together. Similar questions have been asked before:

https://answers.splunk.com/answers/151040/how-to-parse-json-mixed-in-with-text-data-or-a-timestamp.h...

I am not sure what the clean way to do this is. Below is a sample log:

mysite/844e7cca96f7 EventTime=2019-03-22T20:36:53.920Z LogLevel=error iLogger uievent=unhandledrejection {"url":"http://mysite.com","error_message":"Blocked a frame with origin \"https://mysite.com\" from accessing a cross-origin frame.","errStack":"{\"isTrusted\":true}","urCounter":1} sm_serversessionid=iVML0bvuCTjsdkfSW/sEp7WgjNKwRPpc= sm_transactionid=000000000000000000000000cb2c10000b-0d5d-5c954765-f7fef700-a123457d0351 samaname=mark id=2492341 sm_user=EZ\mark

I need to group events by error_message.

woodcock
Esteemed Legend

Why not just do this:

index=YouShouldAlwaysSpecifyAnIndex sourcetype=AndSourcetypeToo
| rex max_match=0 ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"
| stats count by error_message

See this run-anywhere example:

| makeresults
| eval _raw="mysite/844e7cca96f7 EventTime=2019-03-22T20:36:53.920Z LogLevel=error iLogger uievent=unhandledrejection {\"url\":\"http://mysite.com\",\"error_message\":\"Blocked a frame with origin \\\"https://mysite.com\\\" from accessing a cross-origin frame.\",\"errStack\":\"{\\\"isTrusted\\\":true}\",\"urCounter\":1} sm_serversessionid=iVML0bvuCTjsdkfSW/sEp7WgjNKwRPpc= sm_transactionid=000000000000000000000000cb2c10000b-0d5d-5c954765-f7fef700-a123457d0351 samaname=mark id=2492341 sm_user=EZ\mark"
| rex ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"
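
If you would rather use spath, as the question title asks, one option is to isolate the JSON object with rex first and then point spath at the extracted field. This is a minimal sketch, not a tested solution. It assumes each event contains exactly one JSON object and no curly braces outside it, and the field name json_payload is a placeholder chosen for illustration:

```spl
index=YouShouldAlwaysSpecifyAnIndex sourcetype=AndSourcetypeToo
| rex "(?<json_payload>\{.*\})"
| spath input=json_payload path=error_message output=error_message
| stats count by error_message
```

Because spath parses the JSON, escaped quotes inside error_message should come back unescaped, so the sed cleanup step should not be needed in this variant. The greedy \{.*\} grabs everything from the first { to the last } in the event, which is why the single-object assumption matters.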

maulikdesai21
Engager

Thanks @woodcock, that works 🙂

However, I am having a bit of a hard time understanding the regex, specifically the part below:

| rex max_match=0 ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"

woodcock
Esteemed Legend

Click Accept to close the question. Throw the RegEx into RegEx101.com and it will explain all.
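
For reference, once SPL has unescaped the quoted string, the pattern the regex engine sees is ,"error_message":"(?<error_message>.*?)","(?<!\\"). The literal ,"error_message":" anchors on the key. (?<error_message>.*?) is a named, lazy capture group that becomes the field. The following "," matches the closing quote of the value, the comma, and the opening quote of the next key. (?<!\\") is a negative lookbehind for a backslash followed by a quote. It appears that this lookbehind is checked only after "," has already matched, so the two characters before it are always ," and it never rejects anything; the lazy .*? is what actually ends the capture. A sketch of a variant that handles escaped quotes inside the value explicitly, assuming the value is a well-formed JSON string:

```spl
index=YouShouldAlwaysSpecifyAnIndex sourcetype=AndSourcetypeToo
| rex max_match=0 "\"error_message\":\"(?<error_message>(?:[^\"\\\\]|\\\\.)*)\""
| rex field=error_message mode=sed "s/\\\\\"/\"/g"
| stats count by error_message
```

After unescaping, the capture group is (?:[^"\\]|\\.)*, which consumes either an ordinary character or a backslash plus the character it escapes. The capture therefore ends at the first unescaped quote and does not depend on another key following error_message.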
