Solved: Getting blank rows when comparing two JSON data se...

splunk_dev · ‎03-18-2024

Hi ,

I am comparing two JSON data sets with respect to values of some nested keys in them.

The comparison is working fine except that at the end I am getting some blank rows with no data for them in the columns except the diff column that I am inserting.

I am including the query that I am using. However, since I am using appendcols in this, so the data sets returned by the search commands would be as below respectively:

data1={
\"Sugar\": {
\"prod_rate\" : \"50\",
\"prod_qual\" : \"Good\"
},
\"Rice\": {
\"prod_rate\" : \"80\",
\"prod_qual\" : \"OK\"
},
\"Potato\": {
\"prod_rate\" : \"87\",
\"prod_qual\" : \"OK\"
}
}

data2="{
\"Sugar\": {
\"prod_rate\" : \"50\",
\"prod_qual\" : \"Good\"
},
\"Wheat\": {
\"prod_rate\" : \"50\",
\"prod_qual\" : \"Good\"

}
}"

The actual query with proper search command in place is actually returning some blank rows. How can I remove them from display ?

index = data1

| eval grain_name = json_array_to_mv(json_keys(data1))
|mvexpand grain_name
|eval data = json_extract(data1, grain_name), qual = json_extract(data, "prod_qual")
|table grain_name, qual
| appendcols [ search index=data2| eval grain_name2 = json_array_to_mv(json_keys(data2))
| mvexpand grain_name2
| eval data2 = json_extract(data2, grain_name2), qual2 = json_extract(data2, "prod_qual")]
|eval diff = if(match (qual, qual2), "Same", "NotSame")
|table grain_name, qual, diff

ITWhisperer · ‎03-18-2024

The appendcols command just appends columns in whatever order they are returned in with no respect for the values in the "common" fields, so, your comparison is like comparing apples with oranges, or in your case wheat with rice. Is this what you really want to do?

You could use stats to combine the values from each search like this

index = data1

| eval grain_name = json_array_to_mv(json_keys(data1))
|mvexpand grain_name
|eval data = json_extract(data1, grain_name), qual = json_extract(data, "prod_qual")
|table grain_name, qual
| append [ search index=data2
| eval grain_name = json_array_to_mv(json_keys(data2))
| mvexpand grain_name
| eval data2 = json_extract(data2, grain_name), qual2 = json_extract(data2, "prod_qual")]
| stats values(qual) as qual values(qual2) as qual2 by grain_name
|eval diff = if(match (qual, qual2), "Same", "NotSame")
|table grain_name, qual, diff

View solution in original post

ITWhisperer · ‎03-18-2024

The appendcols command just appends columns in whatever order they are returned in with no respect for the values in the "common" fields, so, your comparison is like comparing apples with oranges, or in your case wheat with rice. Is this what you really want to do?

You could use stats to combine the values from each search like this

index = data1

| eval grain_name = json_array_to_mv(json_keys(data1))
|mvexpand grain_name
|eval data = json_extract(data1, grain_name), qual = json_extract(data, "prod_qual")
|table grain_name, qual
| append [ search index=data2
| eval grain_name = json_array_to_mv(json_keys(data2))
| mvexpand grain_name
| eval data2 = json_extract(data2, grain_name), qual2 = json_extract(data2, "prod_qual")]
| stats values(qual) as qual values(qual2) as qual2 by grain_name
|eval diff = if(match (qual, qual2), "Same", "NotSame")
|table grain_name, qual, diff

splunk_dev · ‎03-18-2024

You are correct, I want to compare apple with apple (or sugar with sugar).

Your query has removed the blank rows.

But the comparison is failing. It is saying every thing is Notsame.

However here the sugar prod_qual is same. In the real data sets there are many values which are same and few are not same.

Thanks

ITWhisperer · ‎03-18-2024

The problem with fake, made-up data is when it does not accurately represent your real data. We can only provide solutions based on the data you have given. If it does not represent your data closely enough, our solutions may not work with your actual data. Please try to provide representative examples (anonymised as appropriate) which demonstrate why the proposed solution does not work for you.

splunk_dev · ‎03-18-2024

Well, the data is little simplified but the structure is same.

Actually , I am trying to compare not only prod_qual but other values as well.

Now it is working

I am comparing both prod_qual and prod_rate hence I have modified your query as below,

index = data1

| eval grain_name = json_array_to_mv(json_keys(data1))
|mvexpand grain_name
|eval data = json_extract(data1, grain_name), rate= json_extract(data, "prod_rate"), qual = json_extract(data, "prod_qual")
|table grain_name, qual
| append [ search index=data2
| eval grain_name = json_array_to_mv(json_keys(data2))
| mvexpand grain_name
| eval data2 = json_extract(data2, grain_name),rate2= json_extract(data2, "prod_rate") qual2 = json_extract(data2, "prod_qual")]
| stats values(qual) as qual values(qual2) as qual2  values(rate) as rate values(rate2) as rate2 by grain_name
|eval diff = if(match (qual, qual2), "Same", "NotSame")
|eval diff2 = if(diff == "Same", if(rate==rate2, "Same", "NotSame"), "NotSame")
|table grain_name, qual, rate, diff2

Getting blank rows when comparing two JSON data sets with some common and unique keys in them

table

SOCin’ it to you at Splunk University

Credit Card Data Protection & PCI Compliance with Splunk Edge Processor

Stay Connected: Your Guide to July Tech Talks, Office Hours, and Webinars!

Are you a member of the Splunk Community?

Getting blank rows when comparing two JSON data sets with some common and unique keys in them

table

SOCin’ it to you at Splunk University

Credit Card Data Protection & PCI Compliance with Splunk Edge Processor

Stay Connected: Your Guide to July Tech Talks, Office Hours, and Webinars!