Solved: Join searches by hostname when one index has the short name and the other index has long name.

jfraley · 06-20-2025

I am looking for a way to join results from two indexes based on the hostname. The main index has the hostname as just name, and the second index has it as name.domain.com. The fields are spec.name and block. I tried to wildcard it, but the results were erratic.

index=infrastructure_reports source=nutanix_vm_host_report | fields spec.name spec.cluster_reference.name  spec.resources.memory_size_mib 
|rename spec.name as block |join block* [ search index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini 
| fields block app_stack operating_system]
| table block spec.cluster spec.resources.memory_size_mib operating_system

yuanliu · 06-20-2025

The most important lesson you can learn here is: Don't join. Meanwhile, your description is inconsistent about which field really holds the hostname, and even about which index is "main". The following will mimic what you get from the join, but with better performance.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog
  path=/tmp/jira_assets_extract.ini)
| eval block = coalesce(block, 'spec.name')
| fields block spec.cluster_reference.name spec.resources.memory_size_mib operating_system
| stats values(*) as * by block

I vaguely get the sense that the "main" index (I assume that's infrastructure_reports, which produces the spec.name field) lacks domain.com in some events, causing missed matches. You cannot solve this problem with a wildcard.
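
One way to see which hosts are affected, as a sketch rather than a definitive check: group both sources by the combined hostname and list the values that appear in only one index. index_count and indexes are illustrative names, and the base searches are copied from the search above.

```spl
(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini)
| eval block = coalesce(block, 'spec.name')
| stats dc(index) as index_count values(index) as indexes by block
| where index_count < 2
```

Any block value listed here has no counterpart in the other index. Note that the OR-plus-stats form keeps such hosts in its output, whereas join drops them by default (type=inner).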

One key piece of information you also did not clarify is what the possible values of "domain.com" are, given that this is simply a stand-in string. If there is more than one value for "domain.com", and the same "name" part can appear under multiple domains and represent different hosts, your problem is unsolvable.
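
To check whether that is the case in your data, a sketch like the following could help. It assumes the fully qualified names live in block on the syslog side, which the question leaves unclear; swap in spec.name and the other base search if it is the other way round. fqdn, short_name, fqdn_count and fqdns are illustrative names.

```spl
index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini
| eval fqdn = lower(block)
| eval short_name = mvindex(split(fqdn, "."), 0)
| stats dc(fqdn) as fqdn_count values(fqdn) as fqdns by short_name
| where fqdn_count > 1
```

Each row returned is a short name that maps to more than one full name. An empty result suggests that stripping the domain is safe for this data.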

The only way the problem is solvable is if "domain.com" doesn't matter, i.e., if the "name" part alone is unique for every host. If this is the case, you can strip out the "domain.com" part in spec.name.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini)
| rex field=spec.name "^(?<block>[^\.]+)"
| fields block spec.cluster_reference.name spec.resources.memory_size_mib operating_system
| stats values(*) as * by block
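
If it is unclear which side carries the domain, a variation, offered as a sketch under the same uniqueness assumption, is to normalize whichever field is present: take block or spec.name, keep the part before the first dot, and lowercase it. The lower() call is an extra assumption that hostnames should match case-insensitively; drop it if case matters. The field list uses spec.cluster_reference.name, the name from the original search.

```spl
(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini)
| eval block = lower(mvindex(split(coalesce(block, 'spec.name'), "."), 0))
| fields block spec.cluster_reference.name spec.resources.memory_size_mib operating_system
| stats values(*) as * by block
```

Because the eval handles both fields, it does not matter whether the short name or the long name ends up in block.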

jfraley · 06-21-2025

That is perfect. Exactly what I needed. This was the most helpful reply to any question I think I have ever posted to a forum.

LAME-Creations · 06-20-2025

This is exactly the way to solve this problem. Honestly, as you start to really master Splunk, you will find that stats seems to be the answer for everything.

This is a very helpful presentation on exactly this problem.

let-stats-sort-them-out-building-complex-result-sets-that-use-multiple-source-types.pdf
slide 33
