Splunk Search

Join searches by hostname when one index has the short name and the other index has long name.

jfraley
Path Finder

I am looking for away to join results from two indexes based on the hostname. The main index has the hostname as just name and the second index has it by just name.domain.com. The fields are spec.name and block. I tried to wild card it, but the results were erratic.

index=infrastructure_reports source=nutanix_vm_host_report | fields spec.name spec.cluster_reference.name  spec.resources.memory_size_mib 
|rename spec.name as block |join block* [ search index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini 
| fields block app_stack operating_system]
| table block spec.cluster spec.resources.memory_size_mib operating_system

 

Labels (2)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

The most important lesson you can learn here is: Don't join.  Meanwhile, your description is inconsistent about which field is really hostname, and even which index is "main".  The following will mimic what you get but with better performance.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog
  path=/tmp/jira_assets_extract.ini)
|eval block = coalesce(block, 'spec.name')
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block

I vaguely get the sense that the "main" index - I assume that's infrastructure which produces spec.name field - lacks domain.com in some events, causing missed matches.  You cannot solve this problem by wildcard.

One key piece of information you also did not clarify is what possible values of "domain.com" are, given that this is simply a stand-in string.  If there are more than one value for "domain.com", and "name" part could match multiple "domain.com" and represent different hostnames, your problem is unsolvable.

The only way the problem is solvable is if "domain.com" doesn't matter, i.e., if "name" part is unique for any hostname.  If this is the case, you can strip out the "domain.com" part in spec.name.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini)
| rex field=spec.name "^(?<block>[^\.]+)"
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block

 

View solution in original post

Tags (1)

yuanliu
SplunkTrust
SplunkTrust

The most important lesson you can learn here is: Don't join.  Meanwhile, your description is inconsistent about which field is really hostname, and even which index is "main".  The following will mimic what you get but with better performance.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog
  path=/tmp/jira_assets_extract.ini)
|eval block = coalesce(block, 'spec.name')
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block

I vaguely get the sense that the "main" index - I assume that's infrastructure which produces spec.name field - lacks domain.com in some events, causing missed matches.  You cannot solve this problem by wildcard.

One key piece of information you also did not clarify is what possible values of "domain.com" are, given that this is simply a stand-in string.  If there are more than one value for "domain.com", and "name" part could match multiple "domain.com" and represent different hostnames, your problem is unsolvable.

The only way the problem is solvable is if "domain.com" doesn't matter, i.e., if "name" part is unique for any hostname.  If this is the case, you can strip out the "domain.com" part in spec.name.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini)
| rex field=spec.name "^(?<block>[^\.]+)"
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block

 

Tags (1)

jfraley
Path Finder

That is perfect. Exactly what I needed. This was the most helpful reply to any question I think I have ever posted to a forum.

0 Karma

LAME-Creations
Path Finder

This is exactly the way to solve this problem.  Honestly, as you start to really master splunk you will find that Stats seems to be the answer for everything.  

this is a very helpful presentation on your very problem.  

let-stats-sort-them-out-building-complex-result-sets-that-use-multiple-source-types.pdf
slide 33

0 Karma
Get Updates on the Splunk Community!

Deep Dive into Federated Analytics: Unlocking the Full Power of Your Security Data

In today’s complex digital landscape, security teams face increasing pressure to protect sprawling data across ...

Your summer travels continue with new course releases

Summer in the Northern hemisphere is in full swing, and is often a time to travel and explore. If your summer ...

From Alert to Resolution: How Splunk Observability Helps SREs Navigate Critical ...

It's 3:17 AM, and your phone buzzes with an urgent alert. Wire transfer processing times have spiked, and ...