Splunk Search

How to compare two inputlook kv columns?

akshaycloud11
Loves-to-Learn Lots

Hi There, 
we have two inputlook kv (File1 and File2)  files and I want to compare 3 columns (AvsA, BvsB, CvsC) between each file and display the result where we clouldn't find any value in all the 3 columns( like non of the comparison written any value I mean blank). 

Eg. File1 has Column A which contains roll numbers, Column B contains the Name and Column C contains the Registration number. File2 has the similar columns therefore we need to compare the information (AvsA, BvsB, CvsC) and find out where we have blanks (like there were no results for all the comparison AvsA, BvsB and CvsC) and report them in a table unique once. 

Labels (1)
0 Karma

akshaycloud11
Loves-to-Learn Lots

The query is giving false results ( I mean, the query is reporting the data which is present in File 2)
let me explain the problem one more time. 

File 1

Row#roll numbersName Registration #
15  
27Ajay999
3  123
410  
5 Vijay 

 
File2

Row#Classroll numbersSectionNameRegistration #
1V2AAaron565
2VI4BMichel321
3IV3DJeff678
4VIII7EAjay999
5X8HKumar767
6XII10F 098
7XI12NEvan 345

 


now, I want to compare the following columns:
Roll numbers(File1) against Roll numbers(File2)
Name(File1) against the Name(File2)
Registration #(File1) against the Registration#(File2)

if we find any one of the record File1 in File2 then we will not report that, I shall only report the record which we couldn't find by roll number/name/registration#

Note - We have to compare FILE1 against FILE 2 and report the FILE1 missing records in File2

Desired OUTPUT 

Row #roll numbersNameRegistration #
15  
2  123
3 Vijay 

 

Total 3 records not found (present in File1 and not present in File2)

0 Karma

yuanliu
SplunkTrust
SplunkTrust

This should be a lesson for anyone asking question in this forum: Explain the raw data, describe important features such as blank or null values, which fields accompany which type of events, etc.  We could have saved a ton of time and energy had you

  1. explained that some of your A, B, and C can be blank or null, as well as rules (logic) to deal with them, and
  2. the fact that you are trying to exclude matching of ANY of A, B, or C, not all of A, B, and C.

What may seem "obvious" to you cannot be obvious to volunteers who lack intimate knowledge about your data and use case.

If I take data from your last post, A -> Name, B -> Registration #, and C -> roll numbers, I see lots of them have a single blank character (" ") as value in both File 1 and File 2.  Do you count them as "found"?  Based on your desired output, you seem to deem them "not found" in File 2.

If this interpretation is correct, and if your special repeating values are single blank character " ", the following should give you desired output:

 

| inputlookup File1
| eval kv = "File1"
| append
  [| inputlookup File2
  | eval kv = "File2"]
| eventstats dc(kv) as lookupcount values(kv) as lookup by Name
| where lookupcount == 1 OR Name == " "
| eventstats dc(kv) as lookupcount values(kv) as lookup by Registration__
| where lookupcount == 1 OR Registration__ == " "
| eventstats dc(kv) as lookupcount values(kv) as lookup by roll_numbers
| where (lookupcount == 1 OR roll_numbers == " ") AND lookup == "File1"

 

To help visualize, the following is an emulation of the two inputlookup appends based on sample data you posted:

 

| makeresults
| eval _raw = "
Row#	roll numbers	Name 	Registration #
1	5	 	 
2	7	Ajay	999
3	 	 	123
4	10	 	 
5	 	Vijay	 "
| multikv
| rename Name_ as Name
| eval kv = "File1"
| fields - _time _raw linecount
| append
    [| makeresults
    | eval _raw ="
Row#	Class	roll numbers	Section	Name	Registration #
1	V	2	A	Aaron	565
2	VI	4	B	Michel	321
3	IV	3	D	Jeff	678
4	VIII	7	E	Ajay	999
5	X	8	H	Kumar	767
6	XII	10	F	 	098
7	XI	12	N	Evan	 345"
    | multikv
    | fields - _time _raw linecount
    | eval kv = "File2"]
``` the above simulates two inputlookups appended together ```

 

You can play with this emulation and compare with your actual inputlookups.  Put this emulation with cascaded eventstats-where from the above code, i.e.,

 

| eventstats dc(kv) as lookupcount values(kv) as lookup by Name
| where lookupcount == 1 OR Name == " "
| eventstats dc(kv) as lookupcount values(kv) as lookup by Registration__
| where lookupcount == 1 OR Registration__ == " "
| eventstats dc(kv) as lookupcount values(kv) as lookup by roll_numbers
| where (lookupcount == 1 OR roll_numbers == " ") AND lookup == "File1"

 

I get

ClassNameRegistration__Row_Sectionkvlookuplookupcountroll_numbers
   1 File1File115
  1233 File1File11 
 Vijay 5 File1File11 

Hope this helps

0 Karma

yuanliu
SplunkTrust
SplunkTrust

By "comparison" I assume you mean an equality test, where a false value is considered "blank".  If this is correct, distinct_count, or dc, is your answer.  This is something you can try

| inputlookup File1 ``` OUTPUT A B C ```
| eval kv = File1
| append
    [| inputlookup File2 ``` OUTPUT A B C ```
    | eval kv = File2]
| stats dc(kv) as lookupcount values(kv) as lookup by A B C
| where lookupcount = 1

Hope this helps.

Tags (1)

akshaycloud11
Loves-to-Learn Lots

Hi There, 

Below are the issues I am facing while trying the solution 

1. dc(kv) isn't working in lookupfile. Tried multiple ways and looks like it isn't supporting the lookupfile. 
2. We need to compare File 1 data(it has 10K records) with File2(it has 30K) records and figure out the data which is  present in File1 and not present in File 2 based on lookup by 3 columns (A,B,C) and report the unique only records. We need to report File1 missing data in File2. 

0 Karma

yuanliu
SplunkTrust
SplunkTrust

My mistake.  File1 and File2 needs to be quoted.

| inputlookup File1 ``` OUTPUT A B C ```
| eval kv = "File1"
| append
    [| inputlookup File2 ``` OUTPUT A B C ```
    | eval kv = "File2"]
| stats dc(kv) as lookupcount values(kv) as lookup by A B C
| where lookupcount = 1 AND kv == "File1"
0 Karma

akshaycloud11
Loves-to-Learn Lots

Hi @yuanliu  

when we run the query, the second part of the where condition( kv == "File1") doesn't return any results. 

| where lookupcount = 1 AND kv == "File1"

where condition works for lookupcount =1 and shows the File2 records but we need to find the results of File1 (unique values only). Kindly help. 

0 Karma

yuanliu
SplunkTrust
SplunkTrust

You are correct that inputlookup command doesn't have an "OUTPUT" option; the tree back ticks ("`") opens a comment that is closed by three back ticks.  These comments are meant to highlight that I expect both output from File1 and from File2 to contain A, B, and C.

Now, if the condition lookupcount = 1 AND kv == "File1" gives you no results but lookupcount = 1 AND kv == "File2" returns some thing, that logically means that every combination of A, B, and C that appears in File1 is found in File2, whereas there are some unique combinations in File2.  One way to examine data quickly is to run

| inputlookup File1 ``` OUTPUT A B C ```
| eval kv = "File1"
| append
    [| inputlookup File2 ``` OUTPUT A B C ```
    | eval kv = "File2"]
| stats dc(kv) as lookupcount values(kv) as lookup by A B C
| sort lookupcount

You can see which lookup table have count of one for which combinations of A, B, C.

0 Karma

akshaycloud11
Loves-to-Learn Lots

I also want to understand what you mean by ```OUTPUT A B C`` because inputlookup doesn't support OUTPUT. 

0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

(This is the first of a series of 2 blogs). Splunk Enterprise Security is a fantastic tool that offers robust ...

Index This | What are the 12 Days of Splunk-mas?

December 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...