Splunk Search

How do you fetch JSON embedded in plain text logs using regex or spath?

maulikdesai21
Engager

I have been running into a problem where I need to fetch the value from JSON data in the log. I am aware of spath but I believe spath expects JSON as an input. However, my data has lots of plain text and JSON mixed together. I have seen similar questions being asked before:

https://answers.splunk.com/answers/151040/how-to-parse-json-mixed-in-with-text-data-or-a-timestamp.h...

I am not sure what's the clean way to do this. Below is a sample log:

mysite/844e7cca96f7 EventTime=2019-03-22T20:36:53.920Z LogLevel=error iLogger uievent=unhandledrejection {"url":"http://mysite.com","error_message":"Blocked a frame with origin \"https://mysite.com\" from accessing a cross-origin frame.","errStack":"{\"isTrusted\":true}","urCounter":1} sm_serversessionid=iVML0bvuCTjsdkfSW/sEp7WgjNKwRPpc= sm_transactionid=000000000000000000000000cb2c10000b-0d5d-5c954765-f7fef700-a123457d0351 samaname=mark id=2492341 sm_user=EZ\mark

I need to group stuff by error_message

0 Karma
1 Solution

woodcock
Esteemed Legend

Why not just do this:

index=YouShouldAlwaysSpecifyAnIndex sourcetype=AndSourcetypeToo
| rex max_match=0 ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"
| stats count by error_message

See this run-anywhere example:

| makeresults
| eval _raw="mysite/844e7cca96f7 EventTime=2019-03-22T20:36:53.920Z LogLevel=error iLogger uievent=unhandledrejection {\"url\":\"http://mysite.com\",\"error_message\":\"Blocked a frame with origin \\\"https://mysite.com\\\" from accessing a cross-origin frame.\",\"errStack\":\"{\\\"isTrusted\\\":true}\",\"urCounter\":1} sm_serversessionid=iVML0bvuCTjsdkfSW/sEp7WgjNKwRPpc= sm_transactionid=000000000000000000000000cb2c10000b-0d5d-5c954765-f7fef700-a123457d0351 samaname=mark id=2492341 sm_user=EZ\mark"
| rex ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"

View solution in original post

woodcock
Esteemed Legend

Why not just do this:

index=YouShouldAlwaysSpecifyAnIndex sourcetype=AndSourcetypeToo
| rex max_match=0 ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"
| stats count by error_message

See this run-anywhere example:

| makeresults
| eval _raw="mysite/844e7cca96f7 EventTime=2019-03-22T20:36:53.920Z LogLevel=error iLogger uievent=unhandledrejection {\"url\":\"http://mysite.com\",\"error_message\":\"Blocked a frame with origin \\\"https://mysite.com\\\" from accessing a cross-origin frame.\",\"errStack\":\"{\\\"isTrusted\\\":true}\",\"urCounter\":1} sm_serversessionid=iVML0bvuCTjsdkfSW/sEp7WgjNKwRPpc= sm_transactionid=000000000000000000000000cb2c10000b-0d5d-5c954765-f7fef700-a123457d0351 samaname=mark id=2492341 sm_user=EZ\mark"
| rex ",\"error_message\":\"(?<error_message>.*?)\",\"(?<!\\\\\")"
| rex field=error_message mode=sed "s/\\\\\"/\"/g"

maulikdesai21
Engager

Thanks @woodcock, that works 🙂

However, I having bit hard time understanding the regex, the part below:

| rex max_match=0 ",\"error_message\":\"(?.*?)\",\"(?

0 Karma

woodcock
Esteemed Legend

Click Accept to close the question. Throw the RegEx into RegEx101.com and it will explain all.

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Tech Talk Recap | Mastering Threat Hunting

Mastering Threat HuntingDive into the world of threat hunting, exploring the key differences between ...

Observability for AI Applications: Troubleshooting Latency

If you’re working with proprietary company data, you’re probably going to have a locally hosted LLM or many ...

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

In the age of AI, every tool promises to make our lives easier. From summarizing content to writing code, ...