Splunk Search

How to compare field from one index to multiple field in other index?

Navanitha
Path Finder

Good day,

I have a usecase explained below -

Index A has Reporting_Host (mix of IP address, hostname, FQDN) and Index CMDBB had data from CMDB ( so contains hostname, FQDN, IP Address, Server owner information etc ). My requirement is to map Reporting_Host data from index A again CMDB data and display server owner information along with hostname, IP etc.  Issue here is, Index CMDB has data in multiple fields like Hostname (contains servername), CI_Name(contains FQDN), IP_address(obvious IP address).  How do I match Reporting_Host field values agains these 3 fields in CMDB and display the output?

I tried using join but able to compare with any one field in CMDB data but not 3. Sample query below -

index=A sourcetype=syslog_stats | stats min(_time) as old, max(_time) as new by Reporting_Host | stats min(old) as oldest, max(new) as newest by Reporting_Host | eval diff = tostring((newest - oldest), "duration") | where newest < now() - (86400 * 2) | eval stopped= (now()-newest) | eval stopped_for = round(stopped/86400, 0) | convert ctime(oldest) | convert ctime(newest) | join Reporting_Host [ search index=CMDB | rename HostName as Reporting_Host ] | fields oldest newest diff stopped_for Reporting_Host Server_Owner

I did a field alias for CI_Name, IP_address and Hostname and named it as HostName but its not working.

Labels (1)
0 Karma

yuanliu
SplunkTrust
SplunkTrust

You may want to clarify your requirements.  If I take your description literally, I see two indices, each contain three fields, IP address, hostname, and FQDN, and each contain some extra fields; I would interpret your desire to match multiple fields as a requirement to have exact match of all three fields.  Assuming that matching field names are identical in both indices, there is no reason to not add all three fields to join command.  Is this what you wanted?

But really, you should avoid using join in the first place because it is inefficient.  That is why there is a special section Alternative commands in that document.  The best one would be a combination of OR and stats, like

(index=A sourcetype=syslog_stats) OR index=CMDB
| stats min(_time) as old, max(_time) as new
 values(owner_info1) as owner_info1 values(owner_info2) as owner_info2
 by IP hostname FQDN

I have some suspicion that this is NOT the actual requirement, however, because you mentioned only one field Reporting_Host in index A, but described it as "mix of IP address, hostname, FQDN".  If you don't illustrate what this "mix" means, no one can really help you.

Do you mean that "Reporting_Host" is a string that combines IP, hostname, and FQDN that can match corresponding values in index CMDB?  If I assume that Reporting_Host is of format "127.0.0.0-zeus-jupiter.mycompany.com", i.e., "<ipv4>-<hostname>-<FQDN>", you can split it into the three components and run stats just like above.  In this example, you do

index=A sourcetype=syslog_stats OR index=CMDB
| rex field=Reporting_Host "(?<IP>[^-]+)-(?<hostname>[^-]+)-(?<FQDN>.+)"
| stats min(_time) as old, max(_time) as new values(owner_info1) as owner_info1 values(owner_info2) as owner_info2 by IP hostname FQDN

 

Tags (1)
0 Karma

Navanitha
Path Finder

Hi @yuanliu 

Thank you for looking into it.  When I say mix of values (Reporting_host), I meant like below

Reporting_Host (not a string that contains IP host and FQDN but its like below but list of individual values)

abc@domain.com

100.10.10.10

abc

xyz

44.16.23.01

xyz@domain.com

and goes on..

 

0 Karma

yuanliu
SplunkTrust
SplunkTrust

This explanation is much clearer.  So, Reporting_Host could be any of the three things, and you don't have a reliable way to predict which one it is in a given event.  If so, you will need to process it first to know which one each event comes with.  For example,

(index=A sourcetype=syslog_stats) OR index=CMDB
| rex field=Reporting_Host "^[^@]+@(?<FQDN>.+)"
| rex field=Reporting_Host "^(?<hostname>[^\.]+)$"
| rex field=Reporting_Host "^(?<IP>\d+(\.\d+){3}$")
| stats min(_time) as old, max(_time) as new
 values(owner_info1) as owner_info1 values(owner_info2) as owner_info2
 by IP hostname FQDN

Because you didn't provide additional identifying information/criteria, the above takes a lot of assumptions about your data, e.g., "hostname contains no dot (.)."  If you have additional conditions to help figure out which event comes with which, definitely use it.

0 Karma

Navanitha
Path Finder

😞

I will put some sample data here for better understanding of my usecase.

Index A contains events like below.  Here I already did a field extraction for IP address Hostname FQDN and named it as Reporting host.  This is basically list of servers that are forwarding logs.

2023-02-21 00:14:43.6543016070 2.2.2.2
2023-02-21 00:14:43.6213010920 abc.domain.net
2023-02-21 00:14:43.6543016070 4.3.2.1
2023-02-21 00:14:43.6213010920 xyz.domain.net
2023-02-21 00:14:43.6543016070 1.1.1.1
2023-02-21 00:14:43.6213010920 pqr

so when I index=A | stats count by Reporting_host, the result is below.  First of the query works.  

Reporting_Host

2.2.2.2
abc.domain.net
4.3.2.1
xyz.domain.net
1.1.1.1
pqr

Now I have second index which is from our CMDB which contains Server information like owner, BU, location etc.

2023-02-20 07:47:14.269, CI_Name="xyz.domain.net", Hostname="xyz", Domain="example.com", Environment="QA", IP_Address="3.5.4.6", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="Owner3"

2023-02-20 07:47:14.269, CI_Name="pqr.domain.net", Hostname="pqr", Domain="example.com", Environment="QA", IP_Address="13.15.14.16", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="owner2"

2023-02-20 07:47:14.269, CI_Name="Host1.domain.net", Hostname="Host1", Domain="example.com", Environment="QA", IP_Address="2.2.2.2", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="owner1"

My usecase is I need to map the Reporting_host from index A with index B data and get the server information.  Challenge I am facing here is, Reporting_host as shown above is list of IP address, FQDN, Hostnames ( all comes in separate events like some hosts are reporting with IP address, some are reporting with FQDN and rest with IP address so the results) so how can I compare REporting_host with CMDB data and get the Server information.  If the Reporting_host comes in one format like hostname or IP address the comparison could have been easy so I will just use join and the common field would be Reporting_host comparing it with Hostname in index B.  Since the Reporting_host is has 3 different formats how do I compare them ?  I hope I explained my usecase in detail now.  Sorry for too many post.

For above ex: my output should be

Reporting_Host Server_Owner

2.2.2.2 Owner1

xyz.domain.net Owner3

pqr Owner2

 

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Yes, if you want specific help, it is extremely important to illustrate and explain your data.  The main difference between this latest illustration and what you explained previously is in field names.  Also, that Reporting_Host field does not contain E-mail address format.  Therefore I will not bother to match that.  Otherwise, the general formula remains unchanged.

 

(index=A sourcetype=syslog_stats) OR index=CMDB
| rex field=Reporting_Host "^(?<Hostname>[^\.]+)$"
| rex field=Reporting_Host "^(?<IP_Address>\d+(\.\d+){3})$"
| rex field=Reporting_Host "^(?<CI_Name>[^\.]+(\.[^\.]+)+)$"
| eval CI_Name = if(index=="A" AND isnotnull(IP_Address), null(), CI_Name)
| stats values(Server_Owner) as Server_Owner by IP_Address CI_Name Hostname

 

Using your samples, the first line will give you events like these

_raw_time
2023-02-20 07:47:14.269, CI_Name="xyz.domain.net", Hostname="xyz", Domain="example.com", Environment="QA", IP_Address="3.5.4.6", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="Owner3"2023-02-20 07:47:14.269
2023-02-20 07:47:14.269, CI_Name="pqr.domain.net", Hostname="pqr", Domain="example.com", Environment="QA", IP_Address="13.15.14.16", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="owner2"2023-02-20 07:47:14.269
2023-02-20 07:47:14.269, CI_Name="Host1.domain.net", Hostname="Host1", Domain="example.com", Environment="QA", IP_Address="2.2.2.2", Tier1="Hosting", Tier2="Processing unit", Tier3="Server", Operating_System="windows", OS_Version="Server 2016", Priority="Priority_5", Server_Owner="owner1"2023-02-20 07:47:14.269
2023-02-21 00:14:43.6543016070 2.2.2.22023-02-21 00:14:43.6543016070
2023-02-21 00:14:43.6213010920 abc.domain.net2023-02-21 00:14:43.6213010920
2023-02-21 00:14:43.6543016070 4.3.2.12023-02-21 00:14:43.6543016070
2023-02-21 00:14:43.6213010920 xyz.domain.net2023-02-21 00:14:43.6213010920
2023-02-21 00:14:43.6543016070 1.1.1.12023-02-21 00:14:43.6543016070
2023-02-21 00:14:43.6213010920 pqr2023-02-21 00:14:43.6213010920

And the full search will give you

IP_AddressCI_NameHostnameServer_Owner
13.15.14.16pqr.domain.netpqrowner2
2.2.2.2Host1.domain.netHost1owner1
3.5.4.6xyz.domain.netxyzOwner3
0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...