Splunk Search

How to compare two log files

isedrof
Engager

Hello everybody,

I'm working with two log files. The first one, 'Collab.csv', looks like this:

user_name       company    position
bob make        C1         Eng
Alice nelly     C2         Eng
Ashely gerard   C3         HR

And the second one "logapp.csv" has this form:

user_name       user_id    application   sing_in_date   account_validity
bob make        bob.m      app1          10/06/2014     01/05/2015
Alice nelly     alice.n    app2          5/01/2015      01/05/2017
Ashely gerard   ashely.g   app3          07/04/2014     01/05/2016

So the aim here is to retrieve all users that exist in the file "logapp.csv" but DO NOT exist in the first file, "Collab.csv".
What I have done so far is set up the first one as a lookup file and upload the second one via "Add data". I tried all sorts of combinations in my searches, but without success. I'm new to Splunk and I need your help, please.

Thank you

MuS
Legend

Hi isedrof,

a quick and dirty approach is this:

| inputlookup logapp.csv | search NOT [ search source=Collab.csv | dedup user_name | table user_name ]

This reads logapp.csv and keeps only the rows whose user_name is not returned by the subsearch on the other CSV file.
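
If the files are set up the other way around, with Collab.csv as the lookup file and logapp.csv indexed (as described in the question), a sketch along these lines might work instead. It assumes the fields of logapp.csv are extracted, that its source value is exactly logapp.csv, and that every row in Collab.csv has a company value; collab_company is just a made-up output field name:

```spl
index=* source=logapp.csv
| dedup user_name
| lookup Collab.csv user_name OUTPUT company AS collab_company
| where isnull(collab_company)
| table user_name user_id application
```

The lookup command fills collab_company only for rows whose user_name is found in Collab.csv, so keeping the rows where it is null leaves the users that are missing from Collab.csv. Untested, so treat it as a starting point.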

Hope that helps ...

cheers, MuS

isedrof
Engager

Hey,
What if we want to search two lookup files at the same time instead of just one?
Thank you.

MuS
Legend

Do it like this:
| inputlookup one | inputlookup append=t two | ...
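
For the comparison itself, one possible way to use the append is to tag each row with the file it came from and then keep the user_names that only show up in one of them. A sketch, assuming both files are uploaded as lookups and share the user_name column; origin is a made-up field name:

```spl
| inputlookup logapp.csv
| eval origin="logapp"
| inputlookup append=true Collab.csv
| eval origin=coalesce(origin, "collab")
| stats values(origin) AS origin BY user_name
| where mvcount(origin)=1 AND origin="logapp"
```

Rows from the second file arrive without an origin value, so coalesce fills in "collab" only for those. After the stats, a user_name that appears in both files has two origin values, and the where clause keeps only the ones seen in logapp.csv alone. Untested.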

isedrof
Engager

Thank you very much, it works.

isedrof
Engager

Hey,
How can we filter on "user_name" so that we extract only those that do not start with "WS", for example, and also filter on "sing_in_date < 08/07/2015"?
Thanks again.

MuS
Legend

Regarding the user_name, simply put another NOT user_name=WS* into the search pipe. It needs to sit in the outer search, so that it filters the rows from logapp.csv rather than the list coming back from Collab.csv:

| inputlookup logapp.csv | search NOT user_name=WS* NOT [ search index=* source=Collab.csv | dedup user_name | table user_name ]

Regarding the sign-in date, this is trickier, because you first need to convert it into an epoch time so Splunk can compare it as a numeric value and not as a string. Note that the column in your file is spelled sing_in_date, so the search has to use that name. Maybe this will do the trick?

| inputlookup logapp.csv | search NOT user_name=WS* NOT [ search index=* source=Collab.csv | dedup user_name | table user_name ] | eval sign_in_epoch=strptime(sing_in_date, "%m/%d/%Y") | where sign_in_epoch < strptime("08/07/2015", "%m/%d/%Y")

This is untested!
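
One thing worth checking first is how the dates actually parse, because a value like 10/06/2014 could be month-first or day-first. A small sketch for eyeballing that, assuming the month-first format "%m/%d/%Y"; sign_in_epoch and sign_in_iso are made-up field names:

```spl
| inputlookup logapp.csv
| eval sign_in_epoch=strptime(sing_in_date, "%m/%d/%Y")
| eval sign_in_iso=strftime(sign_in_epoch, "%Y-%m-%d")
| table user_name sing_in_date sign_in_iso
```

If sign_in_iso comes back empty for some rows, or the month and day look swapped, the file is probably day-first and the format string would need to be "%d/%m/%Y" instead.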

isedrof
Engager

Thank you for your answer. Just one more question: which file should I set up as a lookup file, and which one as data (index)?

MuS
Legend

Basically, you could set up both CSV files as lookups and use them with inputlookup, or index both CSV files into some index and use that index in your search.
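
For the second option, with both files indexed, the same NOT-subsearch idea might look roughly like this. It is only a sketch: it assumes the CSV fields are extracted for both files and that the source values are exactly logapp.csv and Collab.csv, which may differ on your system:

```spl
index=* source=logapp.csv NOT [ search index=* source=Collab.csv | dedup user_name | fields user_name ]
| dedup user_name
| table user_name user_id application
```

Replacing index=* with the actual index name should make it faster. Untested.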

isedrof
Engager

I'm sorry to bother you with so many questions, but I'm new to Splunk, so let me tell you what I have done so far.
I added the first file, "Collab.csv", as data (I guess this is what is called an index in Splunk terms). After that, I added the second file to Lookups (lookup table file and definition). Now I'm trying your approach by typing it into the search bar. Is what I have done so far correct?

MuS
Legend

Try this:

| inputlookup logapp.csv | search NOT [ search index=* source=Collab.csv | dedup user_name | table user_name ]