Splunk Search

Parsing a JSON string in search object

bshega
Explorer

We changed how our data was getting into splunk instead of dealing with full JSON we're just importing the data straight from the database. We have a dashboard that lets our consumer services team search by address, we're using spath currently to parse the JSON. We don't have to do that anymore with the new format but the additional_information part of our object is still JSON, how can I parse this?

This is that the data looks like

2017-01-04 16:41:34.439, id="60", created_at="2017-01-04 16:41:34.43926", updated_at="2017-01-04 16:41:34.43926", email="", encrypted_password="", token="", first_name="Brandon", last_name="Shega", additional_info=""{\"address_2\":\"\",\"city\":\"North Olmsted\",\"zip_code\":\"44070\",\"country\":\"us\",\"address_1\":\"23500 Al Moen Dr\",\"state\":\"Ohio\"}""

So additional_info is really a string that contains a JSON object, is this possible to be parsed to be able to create a dropdown with the full address and search on that?

Tags (1)
1 Solution

niketn
Legend

@bshega, please try the following search

 index=iot-productiondb source=Users
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

Following is a run anywhere search to extract JSON data using rex (first _raw data is cleaned up using replace() function). Then additional_info field is extracted from _raw event using rex command. Finally spath is applied on the additonal_info field:

|  makeresults
|  eval _raw=" 2017-01-04 16:41:34.439, id=\"60\", created_at=\"2017-01-04 16:41:34.43926\", updated_at=\"2017-01-04 16:41:34.43926\", email=\"\", encrypted_password=\"\", token=\"\", first_name=\"Brandon\", last_name=\"Shega\", additional_info=\"\"{\\\"address_2\\\":\\\"\\\",\\\"city\\\":\\\"North Olmsted\\\",\\\"zip_code\\\":\\\"44070\\\",\\\"country\\\":\\\"us\\\",\\\"address_1\\\":\\\"23500 Al Moen Dr\\\",\\\"state\\\":\\\"Ohio\\\"}\"\""
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

PS: Sample data generate using | makeresults and |eval raw=.... has escaped characters in order to generate raw data as per the question.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

View solution in original post

0 Karma

niketn
Legend

@bshega, please try the following search

 index=iot-productiondb source=Users
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

Following is a run anywhere search to extract JSON data using rex (first _raw data is cleaned up using replace() function). Then additional_info field is extracted from _raw event using rex command. Finally spath is applied on the additonal_info field:

|  makeresults
|  eval _raw=" 2017-01-04 16:41:34.439, id=\"60\", created_at=\"2017-01-04 16:41:34.43926\", updated_at=\"2017-01-04 16:41:34.43926\", email=\"\", encrypted_password=\"\", token=\"\", first_name=\"Brandon\", last_name=\"Shega\", additional_info=\"\"{\\\"address_2\\\":\\\"\\\",\\\"city\\\":\\\"North Olmsted\\\",\\\"zip_code\\\":\\\"44070\\\",\\\"country\\\":\\\"us\\\",\\\"address_1\\\":\\\"23500 Al Moen Dr\\\",\\\"state\\\":\\\"Ohio\\\"}\"\""
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

PS: Sample data generate using | makeresults and |eval raw=.... has escaped characters in order to generate raw data as per the question.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

somesoni2
Revered Legend

You should be able to use | spath input=additional_info to parse that embedded json data and extract fields. If those escaped double quotes are causing issue with spath, you may have to correct it before using spath (either by eval-replace or rex-sed).

0 Karma

bshega
Explorer

I actually noticed something else, no matter what I do, Splunk always puts the additional_info as blank, I'm thinking it's because it's a string within a string (notice the double quotes at the beginning and end). So I think Splunk is just taking the first set of quotes and saying that it's an empty string. Is there a way to massage the data already in Splunk to remove the outer set of quotes?

0 Karma

bshega
Explorer

So, I've got something like this

index=iot-productiondb source=Users | eval additional_info = replace(additional_info, "\\/", "") | eval additional_info = ltrim(additional_info, "\"") | eval additional_info = rtrim(additional_info, "\"") | spath input=additional_info | table additional_info

I'm mostly just trying to make sure the JSON is parsing correctly with the table command. Does this look correct? I'm not sure how to actually access the information after the spath. I've tried something like additional_info.country

0 Karma
Get Updates on the Splunk Community!

Say goodbye to manually analyzing phishing and malware threats with Splunk Attack ...

In today’s evolving threat landscape, we understand you’re constantly bombarded with phishing and malware ...

AppDynamics is now part of Splunk Ideas

Hello Splunkers, We have exciting news for you! AppDynamics has been added to the Splunk Ideas Portal. Which ...

Advanced Splunk Data Management Strategies

Join us on Wednesday, May 14, 2025, at 11 AM PDT / 2 PM EDT for an exclusive Tech Talk that delves into ...