Solved: Compare two events with multiple key value pairs

rizwan0683 · ‎02-09-2022

I have two events that are semi-colon separated key value pairs. I have applied the extract command to parse the event into key value pairs. The aim is to compare the two events using the key and highlight the differences in values in a table format with the key as header and values as rows

Event 1
35=D; 54=2; 40=1; 11=abc

Event 2
35=G; 54=2; 40=2; 11=xyz

Result
35|40|11
D|1|abc
G|2|xyz

Which function will index my keys so that I may compare their values and report in the above format?

Extraction performed as follows:

<search> | extract pairdelim=";" kvdelim="\=" clean_keys=false

ITWhisperer · ‎02-10-2022

I am not sure I understand, here is a runanywhere example with the additional field still showing up as expected

| makeresults 
| eval _raw="35=D; 54=2; 40=1; 11=abc; 75=hellow
35=G; 54=2; 40=2; 11=xyz; 75=world"
| multikv noheader=t
| fields _raw
| fields - _time
| extract pairdelim=";" kvdelim="\=" clean_keys=f
| streamstats count as row
| eval row="row".row
| transpose 0 header_field=row column_name=row
| where row1!=row2
| transpose 0 header_field=row column_name=row
| fields - row _raw

As for your other example, all the columns have different values in, so all would be kept

What happens if there is no mapping, what is the field name in this instance?

View solution in original post

rizwan0683 · ‎02-09-2022

This seems to return the two events as _raw only

ITWhisperer · ‎02-09-2022

You did add it after your extract, right?

Please can you share your query?

rizwan0683 · ‎02-09-2022

yes after extract as follows

<base search>
| extract pairdelim=";" kvdelim="\=" clean_keys=false 
| streamstats count as row
| eval row="row".row
| transpose 0 header_field=row column_name=row
| where row1!=row2
| transpose 0 header_field=row column_name=row
| fields - row

note the base search returns two events

ITWhisperer · ‎02-09-2022

Here is a runanywhere example based on your sample events showing it working

| makeresults 
| eval _raw="35=D; 54=2; 40=1; 11=abc
35=G; 54=2; 40=2; 11=xyz"
| multikv noheader=t
| fields _raw
| fields - _time
| extract pairdelim=";" kvdelim="\=" clean_keys=f
| streamstats count as row
| eval row="row".row
| transpose 0 header_field=row column_name=row
| where row1!=row2
| transpose 0 header_field=row column_name=row
| fields - row

Are your real events significantly different to your examples?

rizwan0683 · ‎02-10-2022

Firstly apologies, my first response was incorrect. Your code did in fact bring back _raw events as well as the output of comparing the fields and tabulating them. I had missed those as the _raw events were long and took up the width of the screen and I didn't scroll across. So it appears to be working however, the final output is not as expected.

The real life data is slightly different but I don't believe it is affecting the output (e.g. timestamp field)

My actual output looks like this

_time | date_minute | date_second |  tag_35 | tag_52 | tag_60
2022-02-09T16:33:04.956+0000 | 33 | 4 |  G | 20220209-16:33:04.955 | 20220209-16:33:04.955
2022-02-09T16:31:42.611+0000 | 31 | 42 |  D | 20220209-16:31:42.610 | 20220209-16:31:42.610

These are genuine differences however, there are more fields that have differences not being reported here.

notes:

The first 3 columns are expected as the data does differ in those fields.
The headers for last 4 fields are taken from a dictionary I maintain and can access via inputlookup. In this dictionary key "11" has been mapped to a header named "tag_11".

I note that the keys with differences that are not showing up, happen to be those that DON'T have a mapping in this dictionary. Using my original example:

Event 1
35=D; 54=2; 40=1; 11=abc; 75=hellow

Event 2
35=G; 54=2; 40=2; 11=xyz; 75=world

Your code is generating the equivalent of this output:

Result
35|40|11
D|1|abc
G|2|xyz

i.e 75 is omitted, despite having a difference.

Keys 35, 54, 40, 11 are mapped in this dictionary
Key 75 is not.

Would this cause the discrepancy in the result?

ITWhisperer · ‎02-10-2022

I am not sure I understand, here is a runanywhere example with the additional field still showing up as expected

| makeresults 
| eval _raw="35=D; 54=2; 40=1; 11=abc; 75=hellow
35=G; 54=2; 40=2; 11=xyz; 75=world"
| multikv noheader=t
| fields _raw
| fields - _time
| extract pairdelim=";" kvdelim="\=" clean_keys=f
| streamstats count as row
| eval row="row".row
| transpose 0 header_field=row column_name=row
| where row1!=row2
| transpose 0 header_field=row column_name=row
| fields - row _raw

As for your other example, all the columns have different values in, so all would be kept

What happens if there is no mapping, what is the field name in this instance?

rizwan0683 · ‎02-10-2022

>>What happens if there is no mapping, what is the field name in this instance?

I would expect the key name to display e.g. 35,11,75.

Would it be an issue if the data was arranged randomly e.g.

Event 1
35=D; 54=2; 40=1; 11=abc; 75=hellow

Event 2
54=2; 11=xyz; 75=world; 40=2; 35=G

Can you help breakdown what your code is doing please, specifically

| eval row="row".row

ITWhisperer · ‎02-10-2022

Specifically

| streamstats count as row
| eval row="row".row

Is adding a field called row with a unique number,; this is then prepended with the string "row" so that, when the transpose is done, there are two fields with known names (row1 and row2) for the where command to be able to compare across the two columns (previously rows)

The order doesn't matter because the extract is putting them into the right fields

| makeresults 
| eval _raw="35=D; 54=2; 40=1; 11=abc; 75=hellow
54=2; 11=xyz; 75=world; 40=2; 35=G"
| multikv noheader=t
| fields _raw
| fields - _time
| extract pairdelim=";" kvdelim="\=" clean_keys=f
| streamstats count as row
| eval row="row".row
| transpose 0 header_field=row column_name=row
| where row1!=row2
| transpose 0 header_field=row column_name=row
| fields - row _raw

ITWhisperer · ‎02-09-2022

For two events, you could do this:

| streamstats count as row
| eval row="row".row
| transpose 0 header_field=row column_name=row
| where row1!=row2
| transpose 0 header_field=row column_name=row
| fields - row

If you have more than 2 events, you will need to be more specific about the comparisons for which fields from the events you want to keep

Compare two events with multiple key value pairs

field extraction

Infographic provides the TL;DR for the 2024 Splunk Career Impact Report

Enterprise Security Content Update (ESCU) | New Releases

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?