Splunk Search

Parsing a JSON string in search object

bshega
Explorer

We changed how our data was getting into splunk instead of dealing with full JSON we're just importing the data straight from the database. We have a dashboard that lets our consumer services team search by address, we're using spath currently to parse the JSON. We don't have to do that anymore with the new format but the additional_information part of our object is still JSON, how can I parse this?

This is that the data looks like

2017-01-04 16:41:34.439, id="60", created_at="2017-01-04 16:41:34.43926", updated_at="2017-01-04 16:41:34.43926", email="", encrypted_password="", token="", first_name="Brandon", last_name="Shega", additional_info=""{\"address_2\":\"\",\"city\":\"North Olmsted\",\"zip_code\":\"44070\",\"country\":\"us\",\"address_1\":\"23500 Al Moen Dr\",\"state\":\"Ohio\"}""

So additional_info is really a string that contains a JSON object, is this possible to be parsed to be able to create a dropdown with the full address and search on that?

Tags (1)
1 Solution

niketn
Legend

@bshega, please try the following search

 index=iot-productiondb source=Users
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

Following is a run anywhere search to extract JSON data using rex (first _raw data is cleaned up using replace() function). Then additional_info field is extracted from _raw event using rex command. Finally spath is applied on the additonal_info field:

|  makeresults
|  eval _raw=" 2017-01-04 16:41:34.439, id=\"60\", created_at=\"2017-01-04 16:41:34.43926\", updated_at=\"2017-01-04 16:41:34.43926\", email=\"\", encrypted_password=\"\", token=\"\", first_name=\"Brandon\", last_name=\"Shega\", additional_info=\"\"{\\\"address_2\\\":\\\"\\\",\\\"city\\\":\\\"North Olmsted\\\",\\\"zip_code\\\":\\\"44070\\\",\\\"country\\\":\\\"us\\\",\\\"address_1\\\":\\\"23500 Al Moen Dr\\\",\\\"state\\\":\\\"Ohio\\\"}\"\""
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

PS: Sample data generate using | makeresults and |eval raw=.... has escaped characters in order to generate raw data as per the question.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

View solution in original post

0 Karma

niketn
Legend

@bshega, please try the following search

 index=iot-productiondb source=Users
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

Following is a run anywhere search to extract JSON data using rex (first _raw data is cleaned up using replace() function). Then additional_info field is extracted from _raw event using rex command. Finally spath is applied on the additonal_info field:

|  makeresults
|  eval _raw=" 2017-01-04 16:41:34.439, id=\"60\", created_at=\"2017-01-04 16:41:34.43926\", updated_at=\"2017-01-04 16:41:34.43926\", email=\"\", encrypted_password=\"\", token=\"\", first_name=\"Brandon\", last_name=\"Shega\", additional_info=\"\"{\\\"address_2\\\":\\\"\\\",\\\"city\\\":\\\"North Olmsted\\\",\\\"zip_code\\\":\\\"44070\\\",\\\"country\\\":\\\"us\\\",\\\"address_1\\\":\\\"23500 Al Moen Dr\\\",\\\"state\\\":\\\"Ohio\\\"}\"\""
|  eval _raw=replace(_raw,"\\\\\"","\"")
|  rex "additional_info=\"\"(?<additional_info>[^}]+})"
|  spath input=additional_info

PS: Sample data generate using | makeresults and |eval raw=.... has escaped characters in order to generate raw data as per the question.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

somesoni2
Revered Legend

You should be able to use | spath input=additional_info to parse that embedded json data and extract fields. If those escaped double quotes are causing issue with spath, you may have to correct it before using spath (either by eval-replace or rex-sed).

0 Karma

bshega
Explorer

I actually noticed something else, no matter what I do, Splunk always puts the additional_info as blank, I'm thinking it's because it's a string within a string (notice the double quotes at the beginning and end). So I think Splunk is just taking the first set of quotes and saying that it's an empty string. Is there a way to massage the data already in Splunk to remove the outer set of quotes?

0 Karma

bshega
Explorer

So, I've got something like this

index=iot-productiondb source=Users | eval additional_info = replace(additional_info, "\\/", "") | eval additional_info = ltrim(additional_info, "\"") | eval additional_info = rtrim(additional_info, "\"") | spath input=additional_info | table additional_info

I'm mostly just trying to make sure the JSON is parsing correctly with the table command. Does this look correct? I'm not sure how to actually access the information after the spath. I've tried something like additional_info.country

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...