Splunk Search

How to Extract JSON format as fields?

karthi2809
Contributor

Need to extract json file in fields

{

"AAA":

{

"modified_files": [ "\"b/C:\\\\/HEAD\"",

"\"b/C:\\\\/dev\"",

"\"b/C:\\\\HEAD\"" ]

},

"BBB":

{

"modified_files": [ "\"b/C:\\\\/HEAD\"",

"\"b/C:\\\\/dev\"",

"\"b/C:\\\\HEAD\"" ]

}

}

Expected Output as:

AAA,BBB is application name

eg:

Application: AAA

Thanks in advance

Labels (1)
Tags (2)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

Is this something you are looking for?

 

| foreach *.modified_files{}
    [ eval application=mvappend(application,"<<MATCHSTR>>" . "|" . mvjoin('<<MATCHSTR>>.modified_files{}', "|"))]
| fields application
| mvexpand application
| eval application = split(application, "|")
| eval application.modified_files=mvindex(application,1,-1), application.name=mvindex(application,0)
| table application.*

 

Here, I assume that your indexed JSON has already been extracted, i.e., "flattened" into fields like AAA.modified_files{} and BBB.modified_files{}.  If not, add |spath to do so.

The sample input gives the following:

application.modified_files
application.name
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
AAA
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
BBB

View solution in original post

yuanliu
SplunkTrust
SplunkTrust

Is this something you are looking for?

 

| foreach *.modified_files{}
    [ eval application=mvappend(application,"<<MATCHSTR>>" . "|" . mvjoin('<<MATCHSTR>>.modified_files{}', "|"))]
| fields application
| mvexpand application
| eval application = split(application, "|")
| eval application.modified_files=mvindex(application,1,-1), application.name=mvindex(application,0)
| table application.*

 

Here, I assume that your indexed JSON has already been extracted, i.e., "flattened" into fields like AAA.modified_files{} and BBB.modified_files{}.  If not, add |spath to do so.

The sample input gives the following:

application.modified_files
application.name
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
AAA
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
BBB

karthi2809
Contributor

 

@yuanliu In my json file i have another application which modified field is empty but now the application ccc is not extracting as application.

Thanks 

 

{
"AAA":
{
"modified_files": [ "\"b/C:\\\\/HEAD\"",
                    "\"b/C:\\\\/dev\"",
                    "\"b/C:\\\\HEAD\"" ]
},
"BBB":
{
"modified_files": [ "\"b/C:\\\\/HEAD\"",
                    "\"b/C:\\\\/dev\"",
                    "\"b/C:\\\\HEAD\"" ]
},
"CCC":
{
"modified_files": [
 ]
}
}

 

 

 

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Splunk doesn't output anything with empty array (bug?).  In the past, I used sed to add a distinct string to represent empty arrays.  This will work if you accept the cost of extra extraction.

 

| rex mode=sed "s/\[\s*\]/[\"\"]/"
| spath
| foreach *.modified_files{}
    [ eval application=mvappend(application,"<<MATCHSTR>>" . "|" . mvjoin('<<MATCHSTR>>.modified_files{}', "|"))]
| fields application
| mvexpand application
| eval application = split(application, "|")
| eval application.modified_files=mvindex(application,1,-1), application.name=mvindex(application,0)
| table application.*

 

Here, AAA.modified_files{} and BBB.modified_files{} are already extracted when you do your search, but they are discarded.  They get extracted once more in SPL after you fake the empty array.

Output from sample data:

application.modified_files
application.name
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
AAA
"b/C:\/HEAD"
"b/C:\/dev"
"b/C:\HEAD"
BBB
 CCC
Tags (2)

karthi2809
Contributor

 @yuanliu  perfect Thanks.

0 Karma

karthi2809
Contributor

@yuanliu i have another query by using your query its giving dynamic fields and how can i get the  counts of the dynamic fields.

0 Karma

venky1544
Contributor

@karthi2809  Hey 

it could be that the JSON you pasted here is bit different than your original data. formats usually get changed like an added space or newlines when you copy paste if you could attach a sample file here probably can see why the query is not working and what is the sourcetype of your data is it _json ??

venky1544
Contributor

if you already ingested this data and have the sourcetype as _json 

you can use the following query 

sourcetype=_json |rex max_match=0 field=_raw "\"(?<application>\w+)\":\s\n"

 

venky1544_0-1645706100581.png

 

karthi2809
Contributor

@venky1544 Thanks for the solution .But i am not getting application name as fields.

0 Karma
Get Updates on the Splunk Community!

Maximize the Value from Microsoft Defender with Splunk

 Watch NowJoin Splunk and Sens Consulting for this Security Edition Tech TalkWho should attend:  Security ...

This Week's Community Digest - Splunk Community Happenings [6.27.22]

Get the latest news and updates from the Splunk Community here! News From Splunk Answers ✍️ Splunk Answers is ...

Reminder! Splunk Love Promo: $25 Visa Gift Card for Your Honest SOAR Review With ...

We recently launched our first Splunk Love Special, and it's gone phenomenally well, so we're doing it again, ...