Splunk Search

Want to extract field from JSON from complex json

gauravkumar85
Explorer

My row data will look like below _row=

{"id":"0","severity":"Information","message":"CPW Total= 844961,SEQ Total =244881, EAS Total=1248892, VRS Total=238, CPW Remaining=74572, SEQ Remaining=22, EAS Remaining =62751, VRS Remaining =0, InvetoryDate =4/15/2024 6:16:07 AM"}

I want to extract fields from message and it will look like below. I tried the through rgex but I am unable to extract. Please help to create extract for 

 

CPW TotalSEQ TotalEAS TotalVRS TotalCPW RemainingSEQ RemainingEAS RemainingVRS RemainingInvetoryDate
844961244881124889223874572226275104/15/2024 6:16:07 AM

 

Labels (4)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

Assuming Invetory is spelled (in)correctly, you could try this - the rex at the end is required because this date has an embedded space and it is the last field in the message

| makeresults 
| eval _raw="{\"id\":\"0\",\"severity\":\"Information\",\"message\":\"CPWTotal=749860, SEQTotal=1026137, EASTotal=1062804, VRSTotal=238, CPWRemaining=5612, SEQRemaining=32746, EASRemaining=15, VRSRemaining=0, InvetoryDate=4/16/2024 7:34:25 PM\"}"
| spath
| rename message as _raw
| extract
| rex "InvetoryDate=(?<InvetoryDate>.*)"

If the fields were re-ordered or an extra field was in the message (without an embedded space),  then the rex would not be required

| makeresults 
| eval _raw="{\"id\":\"0\",\"severity\":\"Information\",\"message\":\"CPWTotal=749860, SEQTotal=1026137, EASTotal=1062804, VRSTotal=238, CPWRemaining=5612, SEQRemaining=32746, EASRemaining=15, VRSRemaining=0, InvetoryDate=4/16/2024 7:34:25 PM, Tail=True\"}"
| spath
| rename message as _raw
| extract

View solution in original post

ITWhisperer
SplunkTrust
SplunkTrust

Your sample data is inconsistently formatted, e.g. sometimes there is a space before/after the =/, Please confirm the exact pattern your data will take so we don't waste effort on invalid data.

gauravkumar85
Explorer

Thank you so much for prompt reply. Below is the fixed format of the data. Please help me on this. 

{"id":"0","severity":"Information","message":"CPWTotal=749860, SEQTotal=1026137, EASTotal=1062804, VRSTotal=238, CPWRemaining=5612, SEQRemaining=32746, EASRemaining=15, VRSRemaining=0, InvetoryDate=4/16/2024 7:34:25 PM"} 

Need to extract fields in below format. Your help really appreciated.  

CPW TotalSEQ TotalEAS TotalVRS TotalCPW RemainingSEQ RemainingEAS RemainingVRS RemainingInvetoryDate
844961244881124889223874572226275104/15/2024 6:16:07 AM
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Assuming Invetory is spelled (in)correctly, you could try this - the rex at the end is required because this date has an embedded space and it is the last field in the message

| makeresults 
| eval _raw="{\"id\":\"0\",\"severity\":\"Information\",\"message\":\"CPWTotal=749860, SEQTotal=1026137, EASTotal=1062804, VRSTotal=238, CPWRemaining=5612, SEQRemaining=32746, EASRemaining=15, VRSRemaining=0, InvetoryDate=4/16/2024 7:34:25 PM\"}"
| spath
| rename message as _raw
| extract
| rex "InvetoryDate=(?<InvetoryDate>.*)"

If the fields were re-ordered or an extra field was in the message (without an embedded space),  then the rex would not be required

| makeresults 
| eval _raw="{\"id\":\"0\",\"severity\":\"Information\",\"message\":\"CPWTotal=749860, SEQTotal=1026137, EASTotal=1062804, VRSTotal=238, CPWRemaining=5612, SEQRemaining=32746, EASRemaining=15, VRSRemaining=0, InvetoryDate=4/16/2024 7:34:25 PM, Tail=True\"}"
| spath
| rename message as _raw
| extract

gauravkumar85
Explorer

{"id":"0","severity":"Information","message":"[{\"TARGETSYSTEM\":\"SEQ\",\"ARUNAME\":\"CPW_02170\",\"TOTAL\":437330,\"PROCESSED\":436669,\"REMAINING\":661,\"ERROR\":0,\"SKIPPED\":112},{\"TARGETSYSTEM\":\"SEQ\",\"ARUNAME\":\"CPW_02171\",\"TOTAL\":78833,\"PROCESSED\":78832,\"REMAINING\":1,\"ERROR\":0,\"SKIPPED\":35},{\"TARGETSYSTEM\":\"SEQ\",\"ARUNAME\":\"CPW_02169H\",\"TOTAL\":100192,\"PROCESSED\":100192,\"REMAINING\":0,\"ERROR\":0,\"SKIPPED\":20016},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_00061\",\"TOTAL\":7,\"PROCESSED\":0,\"REMAINING\":7,\"ERROR\":0,\"SKIPPED\":0},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_01015\",\"TOTAL\":9,\"PROCESSED\":0,\"REMAINING\":9,\"ERROR\":0,\"SKIPPED\":0},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_00011H\",\"TOTAL\":17,\"PROCESSED\":0,\"REMAINING\":17,\"ERROR\":0,\"SKIPPED\":0},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_00079\",\"TOTAL\":0,\"PROCESSED\":0,\"REMAINING\":0,\"ERROR\":0,\"SKIPPED\":0},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_02191\",\"TOTAL\":0,\"PROCESSED\":0,\"REMAINING\":0,\"ERROR\":0,\"SKIPPED\":0},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_02184\",\"TOTAL\":0,\"PROCESSED\":0,\"REMAINING\":0,\"ERROR\":0,\"SKIPPED\":0},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_07009CS\",\"TOTAL\":0,\"PROCESSED\":0,\"REMAINING\":0,\"ERROR\":0,\"SKIPPED\":0},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_00304\",\"TOTAL\":1318,\"PROCESSED\":1318,\"REMAINING\":0,\"ERROR\":0,\"SKIPPED\":24},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_00314\",\"TOTAL\":6188,\"PROCESSED\":6188,\"REMAINING\":0,\"ERROR\":0,\"SKIPPED\":1},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_00355\",\"TOTAL\":505,\"PROCESSED\":462,\"REMAINING\":43,\"ERROR\":0,\"SKIPPED\":11},{\"TARGETSYSTEM\":\"CPW\",\"ARUNAME\":\"CPW_00364\",\"TOTAL\":12934,\"PROCESSED\":2804,\"REMAINING\":10130,\"ERROR\":0,\"SKIPPED\":1},{\"\":\"EAS\",\"ARUNAME\":\"CPW_02130\",\"TOTAL\":0,\"PROCESSED\":0,\"REMAINING\":0,\"ERROR\":0,\"SKIPPED\":0}]"}

I want below two views from same data

First View:

View_One.PNG

Second view:

gauravkumar85_0-1713389758786.png

Tags (1)
0 Karma

gauravkumar85
Explorer

Thanks for you response. Your solution is working fine and create below query for search. 

index = **** host=***| spath | eval message="{\"message\":".message."}" | spath input=message message{} output=collection | mvexpand collection | spath input=collection | stats sum(TOTAL) as Total, sum(PROCESSED) as Processed sum(SKIPPED) as Skipped by TARGETSYSTEM

I am using above query. Below chart is created using above query. Now I want to display inventory with date in chart. 

gauravkumar85_0-1713775934397.png

I want display like below

gauravkumar85_1-1713776081096.png

 

 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

This other chart seems to be related to a different search, particularly as it appears to have a date on the x-axis which does not appear as a column in your search.

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Assuming this is supposed to be good JSON (which it isn't) and that you had missed a field name on the last object in the collection, you could try this.

| spath
``` Fix up message to make a valid JSON field ```
| eval message="{\"message\":".message."}"
``` Get the collection from message ```
| spath input=message message{} output=collection
``` Expand the collection into separate events ```
| mvexpand collection
``` Extract the fields ```
| spath input=collection
``` Assume you want the totals by ARUNAME ```
| stats sum(TOTAL) as Total, sum(PROCESSED) as Processed sum(REMAINING) as Remaining sum(ERROR) as Error sum(SKIPPED) as Skipped by ARUNAME

For the first view, you would remove the by clause from the stats command

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Assuming Invetory is spelled (in)correctly, you could try this - the rex at the end is required because this date has an embedded space and it is the last field in the message

If the fields were re-ordered or an extra field was in the message (without an embedded space),  then the rex would not be required


The problem is less embedded space, more lack of embedded quotation marks/proper field separator.  It is semantically more pleasing to fix structure with rex than using rex to extract one data snippet when most are extracted with extract command. (But if you have any influence on developers, beg them to add quotation marks - more on this later.)

 

| rex field=message mode=sed "s/Date=/&\"/ s/$/\"/"
| rename message as _raw
| extract

 

It would give you the same result like


CPWRemaining
CPWTotalEASRemainingEAStatalInvetoryDateSEQRemainingSEQTotalVRSRemainingVRSTotalidseverity
56127498601510628044/16/2024 7:34:25 PM32746102613702380Information

About feedback to developers.  @ITWhisperer gave one option by taking advantage of a side effect/gem feature) from Splunk's extract command by adding a comma at the end of every key-value pair.  They do not have to swap order, but also by simply adding a literal comma after each value, like this:

 

{"id":"0","severity":"Information","message":"CPWTotal=749860, SEQTotal=1026137, EASTotal=1062804, VRSTotal=238, CPWRemaining=5612, SEQRemaining=32746, EASRemaining=15, VRSRemaining=0, InvetoryDate=4/16/2024 7:34:25 PM,"}

 

A more robust fix (that does not rely on Splunk's "generosity") is to properly quote the value.  Any language can extract that without the programmer's attention.

 

{"id":"0","severity":"Information","message":"CPWTotal=749860, SEQTotal=1026137, EASTotal=1062804, VRSTotal=238, CPWRemaining=5612, SEQRemaining=32746, EASRemaining=15, VRSRemaining=0, InvetoryDate=\"4/16/2024 7:34:25 PM\""}

 

The logic should be simple enough: Numeric data, no quote, string data, quote.

 

0 Karma
Get Updates on the Splunk Community!

Index This | Forward, I’m heavy; backward, I’m not. What am I?

April 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

A Guide To Cloud Migration Success

As enterprises’ rapid expansion to the cloud continues, IT leaders are continuously looking for ways to focus ...

Join Us for Splunk University and Get Your Bootcamp Game On!

If you know, you know! Splunk University is the vibe this summer so register today for bootcamps galore ...