Getting Data In

How to compare search and csv file?

Builder

Hello ,

I have a search which gives me all the unique hostnames and csv file which also has the hostnames ,but I am trying to compare both and I want three reports

1)to get all the unique hostnames which are there in both search and csv file
2)get the unique host names which are there only in csv file NOT in search
3)get the unique host names which are there in search but NOT in csv file

the problem here is the search has the hostname field as Host_name and the csv file has assestname as hostname and has an extension for few of them
example

host_name(search)

s222pnlp
S5234gnn
S5135

assestname(csv)

s222pnlp.asd
S5234gnn.yuv
S5135

Tags (3)
0 Karma
1 Solution

SplunkTrust
SplunkTrust

Give this a try

your search giving field host_name | eval From="Search" | append [ Your search to get field assetname from csv | eval host_name=mvindex(split(assetname,"."),0) | table host_name | eval From="CSV" ] | stats values(From) as From by host_name
| eval status=case(mvcount(From)=2,"Present in both", From="Search","Only available in search",1=1,"Only available in CSV") | table host_name status

Updated*

 index=os sourcetype="app-inventory" ad_hostname="*" |table ad_hostname |dedup ad_hostname | rename ad_hostname as Host_Name
 | append [search index=os sourcetype="app-inventory"  altiris_hostname="*" | table altiris_hostname |dedup altiris_hostname | rename altiris_hostname as Host_Name]
 | append [search index=os sourcetype="app-inventory" dns_hostname="*" |table dns_hostname| dedup dns_hostname | rename dns_hostname as Host_Name] | eval From=1
 | append [| inputlookup nexposeallassets.csv |table "Asset Names" | eval Host_Name=mvindex(split('Asset Names',"."),0) | table host_name | eval From=2 ] | stats sum(From) as From by Host_Name
  | eval status=case(From=3,"Present in both", From=1,"Only available in search",1=1,"Only available in CSV") | table Host_Name status

To filter the values which are unique to both combined and not present in both, add following to above search

above search | where status!="Present in both"

View solution in original post

Explorer

i have outputlookup csv file. which contains Computernames which I want. This is one search which runs everyday at 9AM.
I wanted to write another queryw hich looks this csv file and next day when the outputlookup is run, it does not contain sameComputernames from first csv.

Eg:Lets say my search query results with 5 computernames....this result is got from today's search which I have exportted to output lookup abc.csv....
Tomorrow the sae query when run at 9AM, I wanted to ensure it would compare the first csv file to check Computernames if present , dont show that in next day search. Second day csv file should only have new Computernames which are not present in first csv.

0 Karma

Esteemed Legend

Try this as your base search:

hostnameSearchHere | eval From="Search" | appendpipe [|inputcsv max=0 assetname | eval From="CSVfile"] | rex field=Host_Name mode=sed "s/\.[^.]*$//" | stats dc(From) AS numFroms values(*) AS * by Host_Name

Then tack on what you need:
1: | where numFroms>1
2: | where numFroms=1 AND From="CSVfile"
3: | where numFroms=1 AND From="Search"

0 Karma

Builder

Hello woodcock ,I am not able to understand what needs to be replaced in the append pipe

| appendpipe [|inputcsv max=0 assetname | eval From="CSVfile"] - yours

| appendpipe [|inputcsv max=0 "Asset Name" | eval From="nexposeallassets.csv"]-mine

and the numForms has only one value 2

0 Karma

Esteemed Legend

You only need to replace assetname with your actual CSV file name. DO NOT CHANGE ANYTHING ELSE (until you see that it works).

0 Karma

Builder

I can only see the values with numForms value 2

0 Karma

Esteemed Legend

That means that all Host_Name values occur in both data sets. And it is R-then-O, not O-then-R: numFroms.

0 Karma

Builder

well my csv file has assestname with .abc.int or .auther.inc ,i just want to remove these extensions in the csv in splunk ,can we do that.that will make easy to compare

example

Assestname
s4000cawv.abc.int
achaesdb01.auther.inc

0 Karma

Esteemed Legend

My solution ALREADY DOES THAT. It is this part: | rex field=Host_Name mode=sed "s/\.[^.]*$//"

0 Karma

Builder

I know you have taken a lot of time for me and helping me ,really thanks for that but Can you please tell me R-then-O, not O-then-R:numForms because I only see NumForm with 1. below is the query which I ran

index=os sourcetype="app-inventory" ad_hostname="*" |table ad_hostname |dedup ad_hostname | rename ad_hostname as Host_Name| append [search index=os sourcetype="app-inventory"  altiris_hostname="*" | table altiris_hostname |dedup altiris_hostname | rename altiris_hostname as Host_Name]| append [search index=os sourcetype="app-inventory" dns_hostname="*" |table dns_hostname| dedup dns_hostname | rename dns_hostname as Host_Name]|appendpipe[ inputlookup nexposeallassets.csv | table "Asset Names"]| appendpipe [|inputcsv max=0 "Asset Names" | eval From="CSVfile"] | rex field=Host_Name mode=sed "s/\.[^.]*$//" | stats dc(From) AS numFroms values(*) AS * by Host_Name
0 Karma

Esteemed Legend

You keep using f-O-R-m and I am using f-R-O-m. If you type it wrong, things will break.

These are the problems that I see with your search:
1(HUGE): Your search is much different than what you described in your question. I see 3 searches and 2 CSV files wherewas you described 1 search and 1 CSV.
2: You are missing a leading pipe (|) character before inputlookup so that segment of search will do nothing (no events).
3: You only have a single | eval From="<something> which is the key to getting the work done that you say that you need (I had | eval From="Search" that you did not use).

0 Karma

SplunkTrust
SplunkTrust

Give this a try

your search giving field host_name | eval From="Search" | append [ Your search to get field assetname from csv | eval host_name=mvindex(split(assetname,"."),0) | table host_name | eval From="CSV" ] | stats values(From) as From by host_name
| eval status=case(mvcount(From)=2,"Present in both", From="Search","Only available in search",1=1,"Only available in CSV") | table host_name status

Updated*

 index=os sourcetype="app-inventory" ad_hostname="*" |table ad_hostname |dedup ad_hostname | rename ad_hostname as Host_Name
 | append [search index=os sourcetype="app-inventory"  altiris_hostname="*" | table altiris_hostname |dedup altiris_hostname | rename altiris_hostname as Host_Name]
 | append [search index=os sourcetype="app-inventory" dns_hostname="*" |table dns_hostname| dedup dns_hostname | rename dns_hostname as Host_Name] | eval From=1
 | append [| inputlookup nexposeallassets.csv |table "Asset Names" | eval Host_Name=mvindex(split('Asset Names',"."),0) | table host_name | eval From=2 ] | stats sum(From) as From by Host_Name
  | eval status=case(From=3,"Present in both", From=1,"Only available in search",1=1,"Only available in CSV") | table Host_Name status

To filter the values which are unique to both combined and not present in both, add following to above search

above search | where status!="Present in both"

View solution in original post

Path Finder

Hey Splunkers,

Long time lurker first-time poster. I'm doing something similar but looking up IP addresses for brute force attacks. I'm sure I'm hitting a snag with my: |eval IP=mvindex(Source_address,0)

The .csv is pretty straightforward:

Source_address; count
10.0.0.5; 50
10.0.0.1; 2
10.0.0.4; 4

I get values back with my search but validating them, there are some discrepencies

index=yes success=1 ipaddress=* 
| table ipaddress 
| dedup ipaddress 
| rename ipaddress as IP 
| dedup IP
| eval From=1 
| append 
    [| inputlookup ohyes.csv 
    | table Source_address 
    | eval IP=mvindex(Source_address,0) 
    | table * 
    | eval From=2] 
| stats sum(From) as From by IP 
| eval status=case(From=3, "Present in both", From=1, "Only in search", 1=1, "Only in CSV") 
| where status="Present in both"
| table IP status

Any help would be greatly appreciated!

0 Karma

Builder

I tried the below query

index=os sourcetype="app-inventory" adhostname="*" |table adhostname |dedup adhostname | rename adhostname as HostName| append [search index=os sourcetype="app-inventory" altirishostname="" | table altirishostname |dedup altirishostname | rename altirishostname as HostName]| append [search index=os sourcetype="app-inventory" dns_hostname="" |table dnshostname| dedup dnshostname | rename dnshostname as HostName] | eval From="Search" | append [| inputlookup nexposeallassets.csv |table "Asset Names" | eval HostName=mvindex(split("Asset Names","."),0) | table hostname | eval From="CSV" ] | stats values(From) as From by HostName
| eval status=case(mvcount(From)=2,"Present in both", From="Search","Only available in search",1=1,"Only available in CSV") | table Host
Name status

but the output only gives the hostnames available in search but not the whole list of all unique and csv

example

Host_Name status

s2324wee available

0 Karma

SplunkTrust
SplunkTrust

I just checked your subsearch from the nexposeallassets.csv and the field name in eval is Host_Name but the one used in table is host_name. The field names are case sensitive and could be the reason you don't see anything from the CSV. Update "| table host_name" with "| table Host_Name" and try again

0 Karma

Builder

I have changed that one also but I dont see anything from csv all I can see in status is field is only "Only available in search" but not the csv and a whole list of unique hostnames from both of them

0 Karma

SplunkTrust
SplunkTrust

Give this a try. I changed the | eval Host_Name=mvindex(split("Asset Names","."),0) with | eval Host_Name=mvindex(split('Asset Names',"."),0) and the status logic

index=os sourcetype="app-inventory" ad_hostname="*" |table ad_hostname |dedup ad_hostname | rename ad_hostname as Host_Name
| append [search index=os sourcetype="app-inventory"  altiris_hostname="*" | table altiris_hostname |dedup altiris_hostname | rename altiris_hostname as Host_Name]
| append [search index=os sourcetype="app-inventory" dns_hostname="*" |table dns_hostname| dedup dns_hostname | rename dns_hostname as Host_Name] | eval From=1
| append [| inputlookup nexposeallassets.csv |table "Asset Names" | eval Host_Name=mvindex(split('Asset Names',"."),0) | table host_name | eval From=2 ] | stats sum(From) as From by Host_Name
 | eval status=case(From=3,"Present in both", From=1,"Only available in search",1=1,"Only available in CSV") | table Host_Name status
0 Karma

Builder

Hey Somesoni
Thanks a lot it worked.But I want the unique from both combined but not the one present in both?

0 Karma

Builder

Somesoni2

my csv file has hostname with .abc.int or .auther.inc ,i just want to remove these extensions in the csv ,can we do that.that will make easy to compare

example

host_name
s4000cawv.abc.int
achaesdb01.auther.inc

0 Karma

SplunkTrust
SplunkTrust

HI vrmandadi, I believe that this might be a good use for the "join" command : http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Join

You'll be able to break the results down based on the join type (outer, left, inner).

Please let me know if this answers your question!

0 Karma

Builder

I tried using the join commands but the problem is that when it does the match the hostnames for csv has extensions for few of them and that is causing duplication,how can that be removed?

0 Karma