Splunk Search

extracting fields from horrible json events

Cuyose
Builder

So I have some ugly things to deal with. We will eventually fix the logging, but until that time I am left holding the bag dealing and reporting on this stuff.

I have example events like the following. What I need to do is extract each of the "json" elements. However these events are not valid son due to the escape characters in the json. Splunk's new field extractor took away the ability to identify multiple values and intelligently try and create a regex to match, so that option is gone. Seeing that a regex for each of these extractions is probably a bit easier to develop than landing someone on Mars, I come to the community for help.

Ultimately I would like to be able to search for all the events like the following example and click "table" view and have columns for each "json" element.
2014-10-29T19:20:36+00:00 DEBUG (7): ERP_SERVICE_CALL:POST:RESPONSE: "{\/"status\/":\/"success\/",\/"code\/":400,\/"data\/":{\/"batch_id\/":\/"M-1331\/",\/"order_total\/":4,\/"success_total\/":0,\/"orders\/":[{\/"order_id\/":\/"1272749\/",\/"status\/":\/"error\/",\/"message\/":\/"order_id: 1272749 \/\/nCode: INVALID_KEY_OR_REF\/\/nDetails: Invalid item reference key. Item value provided: ASB-000219 \/\/nforEach(EC_Libs-4.0.6.js:70),forEach(EC_Libs-4.0.6.js:70),restletwrapper(null$lib:4) \/\/n[no stack trace]\/",\/"customer_internal_id\/":\/"16873\/",\/"customer_id\/":1301051},{\/"order_id\/":\/"1272750\/",\/"status\/":\/"error\/",\/"message\/":\/"order_id: 1272750 \/\/nCode: INVALID_KEY_OR_REF\/\/nDetails: Invalid item reference key. Item value provided: ASB-000219 \/\/nforEach(EC_Libs-4.0.6.js:70),forEach(EC_Libs-4.0.6.js:70),restletwrapper(null$lib:4) \/\/n[no stack trace]\/",\/"customer_internal_id\/":\/"16873\/",\/"customer_id\/":1301051},{\/"order_id\/":\/"1272751\/",\/"status\/":\/"error\/",\/"message\/":\/"order_id: 1272751 \/\/nCode: INVALID_KEY_OR_REF\/\/nDetails: Invalid item reference key. Item value provided: ASB-000219 \/\/nforEach(EC_Libs-4.0.6.js:70),forEach(EC_Libs-4.0.6.js:70),restletwrapper(null$lib:4) \/\/n[no stack trace]\/",\/"customer_internal_id\/":\/"16873\/",\/"customer_id\/":1301051},{\/"order_id\/":\/"1272752\/",\/"status\/":\/"error\/",\/"message\/":\/"order_id: 1272752 \/\/nCode: INVALID_KEY_OR_REF\/\/nDetails: Invalid item reference key. Item value provided: ASB-000219 \/\/nforEach(EC_Libs-4.0.6.js:70),forEach(EC_Libs-4.0.6.js:70),restletwrapper(null$lib:4) \/\/n[no stack trace]\/",\/"customer_internal_id\/":\/"16873\/",\/"customer_id\/":1301051}]}}"

Tags (2)
0 Karma

Raghav2384
Motivator

alt text

0 Karma

Raghav2384
Motivator

I am not well versed with json format or what fields to extract. I loaded the sample event you posted to a lookup and broke it almost in to key values. Couldn't proceed due to my lack of knowledge with json. Hope this helps.

|inputlookup json.csv|rex field=System1 mode=sed "s/[-|.|\"|\/]//g"|makemv delim="," System1|mvexpand System1

Thanks,
Raghav

0 Karma

Cuyose
Builder

I guess I am not sure how to implement this. This is a regular indexed event, not a lookup. I need to parse in real time the elements in the data. I tried to edit your command in a search argument replacing System1 with _raw, as that is the event data, and nothing extracted for me.
|rex field=System1 mode=sed "s/[-|.|"|/]//g"|makemv delim="," System1|mvexpand System1

0 Karma

Raghav2384
Motivator

Right, instead of |inputlookup, it would be your base search and System1 would be _raw. I just posted a pic as editor doesn't let me type the exact rex.Hope this help
Pardon me if i am going tangents.

0 Karma

Cuyose
Builder

I saw the missing \'s from your image, and added those, however it still did not extract correctly. The only thing extracted was
20141101T16:01:16+00:00 DEBUG (7): ERP_SERVICE_CALL:POST:RESPONSE: {\status:\success\

for every event

0 Karma

Cuyose
Builder

Sorry, this still does not work. I get this error when tacking that onto my search string
Error in 'SearchParser': Missing a search command before '/'. Error at position '88' of search query 'search index=test3 eventtype=NETSUITE_RESPONSE_ERR...{snipped} {errorcontext = "s/[-|.|"|/]//g"|make}'.

0 Karma

Raghav2384
Motivator

Do you mind posting the search?

0 Karma

Cuyose
Builder

Sure,

index=test3 eventtype=NETSUITE_RESPONSE_ERROR NOT "restletwrapper)\"|rex field=_raw mode=sed "s/[-|.|\"|\/]//g"|makemv delim="," _raw|mvexpand _raw

0 Karma

jayannah
Builder

Hi, the JSON syntax is invalid. Please validate your JSON data using any utility available, such as http://jsonlint.com/

So option for you is to extract the required fields using regex or splunk field extraction utility.

jayannah
Builder

Some problem with editor when saving the rex..

I saved the rex for you at
http://regex101.com/r/xI6tN4/1
http://regex101.com/r/pR4kY5/1
http://regex101.com/r/pR4kY5/2

You can use Splunk field extraction utility...

0 Karma

Cuyose
Builder

must be a SPlunk bug, I can see your extraction in regex101, but splunk fails to parse that regex to find matches. Still trying to figure out how Splunk can extract these.

batch_id\/\":\/\"(?[^\/]+)

0 Karma

jayannah
Builder

ok. Then extract the fields individually.

E.g:
batch_id\/\":\/\"(?[^\/]+) ==> will get the value of batch_id and store in the same name.
success_total\/\":(?[\d]+) ==> Will get the value of success_total and store in the same name

You can be use inline extraction from Splunk field extraction utility and store them. So that you don't need to write these extraction in the search query..

0 Karma

Cuyose
Builder

Those extractions do not work with those events, editing the regex finds nothing
batch_id/":/"(?[^/]+)

0 Karma

Cuyose
Builder

Uh, I specifically called out it was not valid json. I need to extract it anyhow, hense the issue. I cannot change the event.

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...