Solved: Re: How to find and show unique and missing keys b...

splunk_dev · ‎01-29-2024

Hi ,

I have two sets of JSON data. I want to find the keys which are unique in one dataset and also keys which are missing in the same in comparison with the other dataset.

My first data set looks as below :

{ 
    "iphone": {
        "price" : "50",
        "review" : "Good"
    },
    "desktop": {
        "price" : "80",
        "review" : "OK"
    },
   "laptop": {
        "price" : "90",
        "review" : "OK"
    }
}

My second data set looks as below :

{ 
    "tv": {
        "price" : "50",
        "review" : "Good"
    },
    "desktop": {
        "price" : "60",
        "review" : "OK"
    }
}

Therefore, for the first data set (w.r.t second data set):

unique values will be : iphone and laptop

and missing values will be : tv

How can I find out this difference and show then in a table with columns like "uniq_value" and "missing_value"

I could only write the query up to this , but this is half part and not what I want:

index=product_db |
|eval p_name=json_array_to_mv(json_keys(_raw))
|eval p_name = mvfilter(NOT match(p_name, "uploadedBy") AND NOT match(p_name, "time")
| mvexpand p_name| table p_name

Thanks

bowesmana · ‎01-30-2024

You don't really need appendcols, which will make the search longer

index=product_db time="1706589466.725491" OR time="1705566003.777518" 
| streamstats c
| eval p_name_{c}=json_array_to_mv(json_keys(_raw)) 
| stats values(p_name_*) as p_name_*
| eval p_unique = mvmap(p_name_1, if(isnull(mvfind(p_name_2, "^".p_name_1."$")), p_name_1, null())) 
| eval p_missing = mvmap(p_name_2, if(isnull(mvfind(p_name_1, "^".p_name_2."$")), p_name_2, null())) 
| table p_unique p_missing

Assuming you only have two events, that should work

Did your attempt work?

View solution in original post

ITWhisperer · ‎01-29-2024

You could do something like this:

| makeresults format=json data="[{ 
    \"iphone\": {
        \"price\" : \"50\",
        \"review\" : \"Good\"
    },
    \"desktop\": {
        \"price\" : \"80\",
        \"review\" : \"OK\"
    },
   \"laptop\": {
        \"price\" : \"90\",
        \"review\" : \"OK\"
    }
},{ 
    \"tv\": {
        \"price\" : \"50\",
        \"review\" : \"Good\"
    },
    \"desktop\": {
        \"price\" : \"60\",
        \"review\" : \"OK\"
    }
}]"
| fields _raw _time
| eval p_name=json_array_to_mv(json_keys(_raw))
| streamstats count as row
| eval flag = pow(2, row - 1)
| mvexpand p_name
| eval {p_name}=flag
| fields - flag row p_name
| stats sum(*) as *

Fields with 1 are only in the first event, fields with 2 are only in the second event (missing from the first event), and fields with 3 are in both events. This also works for more events as the sums are essentially binary flags for which events the fields come from, e.g. for 3 events, 7 would be all event, 5 would be first and third. etc.

bowesmana · ‎01-29-2024

It's not clear if your two datasets are in the same event or in different events. Generally you can use mvmap to compare multivalue fields in the same event for unique/missing, see this example.

| makeresults 
| eval data1="{ 
    \"iphone\": {
        \"price\" : \"50\",
        \"review\" : \"Good\"
    },
    \"desktop\": {
        \"price\" : \"80\",
        \"review\" : \"OK\"
    },
   \"laptop\": {
        \"price\" : \"90\",
        \"review\" : \"OK\"
    }
}", data2="{ 
    \"tv\": {
        \"price\" : \"50\",
        \"review\" : \"Good\"
    },
    \"desktop\": {
        \"price\" : \"60\",
        \"review\" : \"OK\"
    }
}"
| eval p_name_1=json_array_to_mv(json_keys(data1))
| eval p_name_2=json_array_to_mv(json_keys(data2))
| eval p_unique = mvmap(p_name_1, if(isnull(mvfind(p_name_2, "^".p_name_1."$")), p_name_1, null()))
| eval p_missing = mvmap(p_name_2, if(isnull(mvfind(p_name_1, "^".p_name_2."$")), p_name_2, null()))
| table data1 data2 p_unique p_missing

splunk_dev · ‎01-29-2024

Hi ,

Sorry , I missed to mention if they are of same event or different.

The answer is they are from different events.

The standalone query with makeresults is working as expected.

I used the main part of your query with mvmap and tweaked it as below :

index=product_db time="1706589466.725491" | eval data1=if(match(time,"1706589466.725491"),_raw,0)| eval p_name_1=json_array_to_mv(json_keys(data1))|table p_name_1 |appendcols [ search index=product_db time="1705566003.777518" |eval data2=if(match(time,"1705566003.777518"),_raw,0)
| eval p_name_2=json_array_to_mv(json_keys(data2)) ]
| eval p_unique = mvmap(p_name_1, if(isnull(mvfind(p_name_2, "^".p_name_1."$")), p_name_1, null()))
| eval p_missing = mvmap(p_name_2, if(isnull(mvfind(p_name_1, "^".p_name_2."$")), p_name_2, null()))
| table p_unique p_missing

Please let me know if you have any better suggestion to write this query.

Thanks

bowesmana · ‎01-30-2024

You don't really need appendcols, which will make the search longer

index=product_db time="1706589466.725491" OR time="1705566003.777518" 
| streamstats c
| eval p_name_{c}=json_array_to_mv(json_keys(_raw)) 
| stats values(p_name_*) as p_name_*
| eval p_unique = mvmap(p_name_1, if(isnull(mvfind(p_name_2, "^".p_name_1."$")), p_name_1, null())) 
| eval p_missing = mvmap(p_name_2, if(isnull(mvfind(p_name_1, "^".p_name_2."$")), p_name_2, null())) 
| table p_unique p_missing

Assuming you only have two events, that should work

Did your attempt work?

splunk_dev · ‎01-30-2024

Hi ,

Yes, my attempt with appendcols work.

But your one too works very well.

Since you said my one takes longer search compared to yours optimized one, hence, will use your ones.

Thanks much for suggesting this better and optimized query.

Thanks

How to find and show unique and missing keys between two JSON object in Splunk?

table

Dashboards: Hiding charts while search is being executed and other uses for tokens

Splunk Observability Cloud's AI Assistant in Action Series: Explaining Metrics and ...

Brains, Bytes, and Boston: Learn from the Best at .conf25

Are you a member of the Splunk Community?

How to find and show unique and missing keys between two JSON object in Splunk?

table

Dashboards: Hiding charts while search is being executed and other uses for tokens

Splunk Observability Cloud's AI Assistant in Action Series: Explaining Metrics and ...

Brains, Bytes, and Boston: Learn from the Best at .conf25