Reporting

compare splunk search with lookup and output the difference

Richy_s
Path Finder

Can someone help me understand what I am doing wrong here?  

My requirement is I have a index=prod_syslogfarm which will report on the devices forwarding logs to the syslog collectors.  The devices may report with either hostname / IP address / fqdn.  Now, I have to compare this with our master asset inventory (which is the lookup below myinventory.csv) and create a report with the host names that are not seen in prod_syslogfarm index.  I am making hostname as common field for the main search and the lookup file and below is my query.   

Below query is not working as the report contains the hostnames that are there in the syslogfarm index.

index=prod_syslogfarm | stats count by IP_Address | lookup myinventory.csv IP_Address OUTPUT Hostname | table IP_Address Hostname | rename Hostname as Reporting_Host | appendcols [ search index=prod_syslogfarm | eval fqdn_hostname=lower(fqdn_hostname) | eval Reporting_Host=lower(Reporting_Host) | eval Reporting_Host=mvappend(Reporting_Host, fqdn_hostname) ] | dedup Reporting_Host | table Reporting_Host | rename Reporting_Host as Hostname | appendcols
[inputlookup myinventory.csv | eval Hostname=lower(Hostname)
| stats values(Hostname) as cmdb_hostname by Hostname ]
| eval missingname = mvmap(cmdb_hostname, if(cmdb_hostname != Hostname, cmdb_hostname, null()))
| table missingname | mvexpand missingname | lookup myinventory.csv Hostname as missingname OUTPUT Environment Tier3 Operating_System
| table missingname Environment Tier3 Operating_System
Labels (1)
Tags (2)
0 Karma

Richy_s
Path Finder
| inputlookup cmdb_asset_inventory.csv | rename Generic_Hostname as Reporting_Host | eval Reporting_Host=lower(Reporting_Host)
| stats count by Reporting_Host Hostname Environment Tier3 Operating_System | join type=left [ search index=prod_syslogfarm | eval Reporting_Host=lower(Reporting_Host)
| stats values(Reporting_Host) as Exists by Reporting_Host ] | fillnull Exists value=0 | search Exists = 0

 

FYI, query to obtain the cmdb_asset_inventory

index=cmdb | eval Generic_Hostname=mvappend(Hostname, IP_Address)

.  

In the cmdb_asset_inventory, a hostname may contain multiple IP addresses(as you see below).

Output - 

Reporting_Host Hostname Environment Tier3 Operating_System count Exists
1.11.12.13 xyz Production Server Windows Server 2022 1 0
1.0.1.1 xyz Production Server Windows Server 2022 1 0
xyz xyz Production Server Windows Server 2022 1 xyz
xyz.abc.com xyz Production Server Windows Server 2022 1 0

 

I've been able to achieve partial success with the query where Exists=xyz. The challenge I'm facing is that the host "xyz" is reporting with the hostname "xyz", and I'm able to look up this hostname in the inventory. Once a match is found, it should ignore all other combinations since "xyz" in the syslog host is present in the inventory lookup.

 

I tried my best to explain my requirement, apologies if something above doesn't make sense.  I will try to be more clear.

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust
| lookup cmdb_asset_inventory.csv Reporting_Host as IP_Address
| lookup cmdb_asset_inventory.csv Reporting_Host as fqdn_hostname
| lookup cmdb_asset_inventory.csv Reporting_Host as hostname
| stats count by Hostname
| append
    [| inputlookup cmdb_asset_inventory.csv 
    | stats count by Hostname]
| stats count by Hostname
| where count=1
0 Karma

Richy_s
Path Finder

@ITWhisperer Could you explain how this works ? do I need to append this to my original query ? I don't see the syslog_farm index used anywhere in your search query.

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Sorry, I thought that was obvious.

index=prod_syslogfarm 
| lookup cmdb_asset_inventory.csv Reporting_Host as IP_Address
| lookup cmdb_asset_inventory.csv Reporting_Host as fqdn_hostname
| lookup cmdb_asset_inventory.csv Reporting_Host as hostname
| stats count by Hostname
| append
    [| inputlookup cmdb_asset_inventory.csv 
    | stats count by Hostname]
| stats count by Hostname
| where count=1

The way it works is to lookup using the ip address, fqdn hostname and host name using data from the events, then gets a list of Hostnames that have matched the lookup. Next append a list of hostnames from the lookup file. Now when you count the hostnames, when the count is 1 they only appear in the lookup not to events (which would have hostname counts of 2).

0 Karma

Richy_s
Path Finder

@ITWhisperer Appreciate your help on this but the query doesn't seem to be working.  For count=1 / count =2 I see the events appear in both lookup and indexed events.  

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Please share more details

0 Karma

richgalloway
SplunkTrust
SplunkTrust

FWIW, the appendcols command rarely factors into a solution.  The conditions for it to work correctly are to narrow.

The general form to solve a problem like this is to search the index for field values that are not in the lookup table.

index=prod_syslogfarm NOT [ | inputlookup myinventory.csv IP_Address | fields IP_Address | format ]
| stats count by IP_Address 
| lookup myinventory.csv IP_Address OUTPUT Hostname Environment Tier3 Operating_System
| rename Hostname as missingname
| table missingname Environment Tier3 Operating_System
---
If this reply helps you, Karma would be appreciated.
0 Karma

Richy_s
Path Finder

Thank you @richgalloway  But this does not meet my requirement as my syslog data contains combination of hostname, fqdn and IP address and I have to match all these three fields with the respective asset inventory data which has these fields (Hostname, IP address, FQDN).  So I have to check if the syslog hostname or IP or fqdn is present in the aset inventory data and output only if the syslog data doesn't match with any of these three fields in the asset inventory data.  I have posted this question with examples in this link.  Although I have accepted the answer here https://community.splunk.com/t5/Splunk-Search/How-to-compare-a-look-up-field-with-multivalued-indexe... after further testing, it doesn't seem to be working as expected.

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Assuming the values of the ip address, hostname and fqdn_hostname are unique in your lookup, you could try something like this

index=prod_syslogfarm 
| append [ | inputlookup myinventory.csv]
| eval host=coalesce(lower(hostname),lower(Hostname))
| eventstats count as host_count by host
| eval ip=coalesce(ip_address,IP_Address)
| eventstats count as ip_count by ip
| eval fqdnhost=coalesce(lower(fqdn_hostname),lower(FQDN_Hostname))
| eventstats count as fqdn_count by fqdnhost
| where host_count=1 OR ip_count=1 OR fqdn_count=1

You may need to adjust depending on your actual field names in your index and lookup file

0 Karma

Richy_s
Path Finder

@ITWhispererI attempted to execute your search by my goal is to identify and output the assets that are present in `myinventory` lookup but absent from the `syslog_farm` index.

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Please share the search that you have been trying.

0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

(This is the first of a series of 2 blogs). Splunk Enterprise Security is a fantastic tool that offers robust ...

Index This | What are the 12 Days of Splunk-mas?

December 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...