I am looking for away to join results from two indexes based on the hostname. The main index has the hostname as just name and the second index has it by just name.domain.com. The fields are spec.name and block. I tried to wild card it, but the results were erratic.
index=infrastructure_reports source=nutanix_vm_host_report | fields spec.name spec.cluster_reference.name spec.resources.memory_size_mib
|rename spec.name as block |join block* [ search index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini
| fields block app_stack operating_system]
| table block spec.cluster spec.resources.memory_size_mib operating_system
The most important lesson you can learn here is: Don't join. Meanwhile, your description is inconsistent about which field is really hostname, and even which index is "main". The following will mimic what you get but with better performance.
(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog
path=/tmp/jira_assets_extract.ini)
|eval block = coalesce(block, 'spec.name')
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block
I vaguely get the sense that the "main" index - I assume that's infrastructure which produces spec.name field - lacks domain.com in some events, causing missed matches. You cannot solve this problem by wildcard.
One key piece of information you also did not clarify is what possible values of "domain.com" are, given that this is simply a stand-in string. If there are more than one value for "domain.com", and "name" part could match multiple "domain.com" and represent different hostnames, your problem is unsolvable.
The only way the problem is solvable is if "domain.com" doesn't matter, i.e., if "name" part is unique for any hostname. If this is the case, you can strip out the "domain.com" part in spec.name.
(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini)
| rex field=spec.name "^(?<block>[^\.]+)"
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block
The most important lesson you can learn here is: Don't join. Meanwhile, your description is inconsistent about which field is really hostname, and even which index is "main". The following will mimic what you get but with better performance.
(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog
path=/tmp/jira_assets_extract.ini)
|eval block = coalesce(block, 'spec.name')
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block
I vaguely get the sense that the "main" index - I assume that's infrastructure which produces spec.name field - lacks domain.com in some events, causing missed matches. You cannot solve this problem by wildcard.
One key piece of information you also did not clarify is what possible values of "domain.com" are, given that this is simply a stand-in string. If there are more than one value for "domain.com", and "name" part could match multiple "domain.com" and represent different hostnames, your problem is unsolvable.
The only way the problem is solvable is if "domain.com" doesn't matter, i.e., if "name" part is unique for any hostname. If this is the case, you can strip out the "domain.com" part in spec.name.
(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini)
| rex field=spec.name "^(?<block>[^\.]+)"
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block
That is perfect. Exactly what I needed. This was the most helpful reply to any question I think I have ever posted to a forum.
This is exactly the way to solve this problem. Honestly, as you start to really master splunk you will find that Stats seems to be the answer for everything.
this is a very helpful presentation on your very problem.
let-stats-sort-them-out-building-complex-result-sets-that-use-multiple-source-types.pdf
slide 33