Splunk Search

How to Extract JSON format as fields?

karthi2809
Builder

Need to extract json file in fields

{

"AAA":

{

"modified_files": [ "\"b/C:\\\\/HEAD\"",

"\"b/C:\\\\/dev\"",

"\"b/C:\\\\HEAD\"" ]

},

"BBB":

{

"modified_files": [ "\"b/C:\\\\/HEAD\"",

"\"b/C:\\\\/dev\"",

"\"b/C:\\\\HEAD\"" ]

}

}

Expected Output as:

AAA,BBB is application name

eg:

Application: AAA

Thanks in advance

Labels (1)
Tags (2)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

Is this something you are looking for?

 

| foreach *.modified_files{}
    [ eval application=mvappend(application,"<<MATCHSTR>>" . "|" . mvjoin('<<MATCHSTR>>.modified_files{}', "|"))]
| fields application
| mvexpand application
| eval application = split(application, "|")
| eval application.modified_files=mvindex(application,1,-1), application.name=mvindex(application,0)
| table application.*

 

Here, I assume that your indexed JSON has already been extracted, i.e., "flattened" into fields like AAA.modified_files{} and BBB.modified_files{}.  If not, add |spath to do so.

The sample input gives the following:

application.modified_files
application.name
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
AAA
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
BBB

View solution in original post

yuanliu
SplunkTrust
SplunkTrust

Is this something you are looking for?

 

| foreach *.modified_files{}
    [ eval application=mvappend(application,"<<MATCHSTR>>" . "|" . mvjoin('<<MATCHSTR>>.modified_files{}', "|"))]
| fields application
| mvexpand application
| eval application = split(application, "|")
| eval application.modified_files=mvindex(application,1,-1), application.name=mvindex(application,0)
| table application.*

 

Here, I assume that your indexed JSON has already been extracted, i.e., "flattened" into fields like AAA.modified_files{} and BBB.modified_files{}.  If not, add |spath to do so.

The sample input gives the following:

application.modified_files
application.name
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
AAA
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
BBB

karthi2809
Builder

 

@yuanliu In my json file i have another application which modified field is empty but now the application ccc is not extracting as application.

Thanks 

 

{
"AAA":
{
"modified_files": [ "\"b/C:\\\\/HEAD\"",
                    "\"b/C:\\\\/dev\"",
                    "\"b/C:\\\\HEAD\"" ]
},
"BBB":
{
"modified_files": [ "\"b/C:\\\\/HEAD\"",
                    "\"b/C:\\\\/dev\"",
                    "\"b/C:\\\\HEAD\"" ]
},
"CCC":
{
"modified_files": [
 ]
}
}

 

 

 

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Splunk doesn't output anything with empty array (bug?).  In the past, I used sed to add a distinct string to represent empty arrays.  This will work if you accept the cost of extra extraction.

 

| rex mode=sed "s/\[\s*\]/[\"\"]/"
| spath
| foreach *.modified_files{}
    [ eval application=mvappend(application,"<<MATCHSTR>>" . "|" . mvjoin('<<MATCHSTR>>.modified_files{}', "|"))]
| fields application
| mvexpand application
| eval application = split(application, "|")
| eval application.modified_files=mvindex(application,1,-1), application.name=mvindex(application,0)
| table application.*

 

Here, AAA.modified_files{} and BBB.modified_files{} are already extracted when you do your search, but they are discarded.  They get extracted once more in SPL after you fake the empty array.

Output from sample data:

application.modified_files
application.name
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
AAA
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
BBB
 CCC
Tags (2)

karthi2809
Builder

 @yuanliu  perfect Thanks.

0 Karma

karthi2809
Builder

@yuanliu i have another query by using your query its giving dynamic fields and how can i get the  counts of the dynamic fields.

0 Karma

venky1544
Builder

@karthi2809  Hey 

it could be that the JSON you pasted here is bit different than your original data. formats usually get changed like an added space or newlines when you copy paste if you could attach a sample file here probably can see why the query is not working and what is the sourcetype of your data is it _json ??

venky1544
Builder

if you already ingested this data and have the sourcetype as _json 

you can use the following query 

sourcetype=_json |rex max_match=0 field=_raw "\"(?<application>\w+)\":\s\n"

 

venky1544_0-1645706100581.png

 

karthi2809
Builder

@venky1544 Thanks for the solution .But i am not getting application name as fields.

0 Karma
Get Updates on the Splunk Community!

Take Your Breath Away with Splunk Risk-Based Alerting (RBA)

WATCH NOW!The Splunk Guide to Risk-Based Alerting is here to empower your SOC like never before. Join Haylee ...

SignalFlow: What? Why? How?

What is SignalFlow? Splunk Observability Cloud’s analytics engine, SignalFlow, opens up a world of in-depth ...

Federated Search for Amazon S3 | Key Use Cases to Streamline Compliance Workflows

Modern business operations are supported by data compliance. As regulations evolve, organizations must ...