Solved: How to compare search and csv file?

vrmandadi · ‎03-24-2016

Hello ,

I have a search which gives me all the unique hostnames and csv file which also has the hostnames ,but I am trying to compare both and I want three reports

1)to get all the unique hostnames which are there in both search and csv file
2)get the unique host names which are there only in csv file NOT in search
3)get the unique host names which are there in search but NOT in csv file

the problem here is the search has the hostname field as Host_name and the csv file has assestname as hostname and has an extension for few of them
example

host_name(search)

s222pnlp
S5234gnn
S5135

assestname(csv)

s222pnlp.asd
S5234gnn.yuv
S5135

somesoni2 · ‎03-24-2016

Give this a try

your search giving field host_name | eval From="Search" | append [ Your search to get field assetname from csv | eval host_name=mvindex(split(assetname,"."),0) | table host_name | eval From="CSV" ] | stats values(From) as From by host_name
| eval status=case(mvcount(From)=2,"Present in both", From="Search","Only available in search",1=1,"Only available in CSV") | table host_name status

Updated*

 index=os sourcetype="app-inventory" ad_hostname="*" |table ad_hostname |dedup ad_hostname | rename ad_hostname as Host_Name
 | append [search index=os sourcetype="app-inventory"  altiris_hostname="*" | table altiris_hostname |dedup altiris_hostname | rename altiris_hostname as Host_Name]
 | append [search index=os sourcetype="app-inventory" dns_hostname="*" |table dns_hostname| dedup dns_hostname | rename dns_hostname as Host_Name] | eval From=1
 | append [| inputlookup nexposeallassets.csv |table "Asset Names" | eval Host_Name=mvindex(split('Asset Names',"."),0) | table host_name | eval From=2 ] | stats sum(From) as From by Host_Name
  | eval status=case(From=3,"Present in both", From=1,"Only available in search",1=1,"Only available in CSV") | table Host_Name status

To filter the values which are unique to both combined and not present in both, add following to above search

above search | where status!="Present in both"

View solution in original post

sandeepshah81 · ‎06-12-2018

i have outputlookup csv file. which contains Computernames which I want. This is one search which runs everyday at 9AM.
I wanted to write another queryw hich looks this csv file and next day when the outputlookup is run, it does not contain sameComputernames from first csv.

Eg:Lets say my search query results with 5 computernames....this result is got from today's search which I have exportted to output lookup abc.csv....
Tomorrow the sae query when run at 9AM, I wanted to ensure it would compare the first csv file to check Computernames if present , dont show that in next day search. Second day csv file should only have new Computernames which are not present in first csv.

woodcock · ‎03-24-2016

Try this as your base search:

hostnameSearchHere | eval From="Search" | appendpipe [|inputcsv max=0 assetname | eval From="CSVfile"] | rex field=Host_Name mode=sed "s/\.[^.]*$//" | stats dc(From) AS numFroms values(*) AS * by Host_Name

Then tack on what you need:
1: | where numFroms>1
2: | where numFroms=1 AND From="CSVfile"
3: | where numFroms=1 AND From="Search"

vrmandadi · ‎03-24-2016

Hello woodcock ,I am not able to understand what needs to be replaced in the append pipe

| appendpipe [|inputcsv max=0 assetname | eval From="CSVfile"] - yours

| appendpipe [|inputcsv max=0 "Asset Name" | eval From="nexposeallassets.csv"]-mine

and the numForms has only one value 2

woodcock · ‎03-24-2016

You only need to replace assetname with your actual CSV file name. DO NOT CHANGE ANYTHING ELSE (until you see that it works).

vrmandadi · ‎03-24-2016

I can only see the values with numForms value 2

woodcock · ‎03-24-2016

That means that all Host_Name values occur in both data sets. And it is R-then-O, not O-then-R: numFroms.

vrmandadi · ‎03-24-2016

well my csv file has assestname with .abc.int or .auther.inc ,i just want to remove these extensions in the csv in splunk ,can we do that.that will make easy to compare

example

Assestname
s4000cawv.abc.int
achaesdb01.auther.inc

woodcock · ‎03-24-2016

My solution ALREADY DOES THAT. It is this part: | rex field=Host_Name mode=sed "s/\.[^.]*$//"

vrmandadi · ‎03-24-2016

I know you have taken a lot of time for me and helping me ,really thanks for that but Can you please tell me R-then-O, not O-then-R:numForms because I only see NumForm with 1. below is the query which I ran

index=os sourcetype="app-inventory" ad_hostname="*" |table ad_hostname |dedup ad_hostname | rename ad_hostname as Host_Name| append [search index=os sourcetype="app-inventory"  altiris_hostname="*" | table altiris_hostname |dedup altiris_hostname | rename altiris_hostname as Host_Name]| append [search index=os sourcetype="app-inventory" dns_hostname="*" |table dns_hostname| dedup dns_hostname | rename dns_hostname as Host_Name]|appendpipe[ inputlookup nexposeallassets.csv | table "Asset Names"]| appendpipe [|inputcsv max=0 "Asset Names" | eval From="CSVfile"] | rex field=Host_Name mode=sed "s/\.[^.]*$//" | stats dc(From) AS numFroms values(*) AS * by Host_Name

woodcock · ‎03-24-2016

You keep using f-O-R-m and I am using f-R-O-m. If you type it wrong, things will break.

These are the problems that I see with your search:
1(HUGE): Your search is much different than what you described in your question. I see 3 searches and 2 CSV files wherewas you described 1 search and 1 CSV.
2: You are missing a leading pipe (|) character before inputlookup so that segment of search will do nothing (no events).
3: You only have a single | eval From="<something> which is the key to getting the work done that you say that you need (I had | eval From="Search" that you did not use).

somesoni2 · ‎03-24-2016

Give this a try

your search giving field host_name | eval From="Search" | append [ Your search to get field assetname from csv | eval host_name=mvindex(split(assetname,"."),0) | table host_name | eval From="CSV" ] | stats values(From) as From by host_name
| eval status=case(mvcount(From)=2,"Present in both", From="Search","Only available in search",1=1,"Only available in CSV") | table host_name status

Updated*

 index=os sourcetype="app-inventory" ad_hostname="*" |table ad_hostname |dedup ad_hostname | rename ad_hostname as Host_Name
 | append [search index=os sourcetype="app-inventory"  altiris_hostname="*" | table altiris_hostname |dedup altiris_hostname | rename altiris_hostname as Host_Name]
 | append [search index=os sourcetype="app-inventory" dns_hostname="*" |table dns_hostname| dedup dns_hostname | rename dns_hostname as Host_Name] | eval From=1
 | append [| inputlookup nexposeallassets.csv |table "Asset Names" | eval Host_Name=mvindex(split('Asset Names',"."),0) | table host_name | eval From=2 ] | stats sum(From) as From by Host_Name
  | eval status=case(From=3,"Present in both", From=1,"Only available in search",1=1,"Only available in CSV") | table Host_Name status

To filter the values which are unique to both combined and not present in both, add following to above search

above search | where status!="Present in both"

oscarminassian · ‎09-25-2017

Hey Splunkers,

Long time lurker first-time poster. I'm doing something similar but looking up IP addresses for brute force attacks. I'm sure I'm hitting a snag with my: |eval IP=mvindex(Source_address,0)

The .csv is pretty straightforward:

Source_address; count
10.0.0.5; 50
10.0.0.1; 2
10.0.0.4; 4

I get values back with my search but validating them, there are some discrepencies

index=yes success=1 ipaddress=* 
| table ipaddress 
| dedup ipaddress 
| rename ipaddress as IP 
| dedup IP
| eval From=1 
| append 
    [| inputlookup ohyes.csv 
    | table Source_address 
    | eval IP=mvindex(Source_address,0) 
    | table * 
    | eval From=2] 
| stats sum(From) as From by IP 
| eval status=case(From=3, "Present in both", From=1, "Only in search", 1=1, "Only in CSV") 
| where status="Present in both"
| table IP status

Any help would be greatly appreciated!

vrmandadi · ‎03-24-2016

I tried the below query

but the output only gives the hostnames available in search but not the whole list of all unique and csv

example

Host_Name status

s2324wee available

somesoni2 · ‎03-24-2016

I just checked your subsearch from the nexposeallassets.csv and the field name in eval is Host_Name but the one used in table is host_name. The field names are case sensitive and could be the reason you don't see anything from the CSV. Update "| table host_name" with "| table Host_Name" and try again

vrmandadi · ‎03-24-2016

I have changed that one also but I dont see anything from csv all I can see in status is field is only "Only available in search" but not the csv and a whole list of unique hostnames from both of them

somesoni2 · ‎03-24-2016

Give this a try. I changed the | eval Host_Name=mvindex(split("Asset Names","."),0) with | eval Host_Name=mvindex(split('Asset Names',"."),0) and the status logic

index=os sourcetype="app-inventory" ad_hostname="*" |table ad_hostname |dedup ad_hostname | rename ad_hostname as Host_Name
| append [search index=os sourcetype="app-inventory"  altiris_hostname="*" | table altiris_hostname |dedup altiris_hostname | rename altiris_hostname as Host_Name]
| append [search index=os sourcetype="app-inventory" dns_hostname="*" |table dns_hostname| dedup dns_hostname | rename dns_hostname as Host_Name] | eval From=1
| append [| inputlookup nexposeallassets.csv |table "Asset Names" | eval Host_Name=mvindex(split('Asset Names',"."),0) | table host_name | eval From=2 ] | stats sum(From) as From by Host_Name
 | eval status=case(From=3,"Present in both", From=1,"Only available in search",1=1,"Only available in CSV") | table Host_Name status

vrmandadi · ‎03-24-2016

Hey Somesoni
Thanks a lot it worked.But I want the unique from both combined but not the one present in both?

vrmandadi · ‎03-24-2016

Somesoni2

my csv file has hostname with .abc.int or .auther.inc ,i just want to remove these extensions in the csv ,can we do that.that will make easy to compare

example

host_name
s4000cawv.abc.int
achaesdb01.auther.inc

muebel · ‎03-24-2016

HI vrmandadi, I believe that this might be a good use for the "join" command : http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Join

You'll be able to break the results down based on the join type (outer, left, inner).

Please let me know if this answers your question!

vrmandadi · ‎03-24-2016

I tried using the join commands but the problem is that when it does the match the hostnames for csv has extensions for few of them and that is causing duplication,how can that be removed?

How to compare search and csv file?

Stay Connected: Your Guide to November Tech Talks, Office Hours, and Webinars!

Transform your security operations with Splunk Enterprise Security

Splunk Admins and App Developers | Earn a $35 gift card!