Splunk Search

Join searches by hostname when one index has the short name and the other index has long name.

jfraley
Path Finder

I am looking for away to join results from two indexes based on the hostname. The main index has the hostname as just name and the second index has it by just name.domain.com. The fields are spec.name and block. I tried to wild card it, but the results were erratic.

index=infrastructure_reports source=nutanix_vm_host_report | fields spec.name spec.cluster_reference.name  spec.resources.memory_size_mib 
|rename spec.name as block |join block* [ search index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini 
| fields block app_stack operating_system]
| table block spec.cluster spec.resources.memory_size_mib operating_system

 

Labels (2)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

The most important lesson you can learn here is: Don't join.  Meanwhile, your description is inconsistent about which field is really hostname, and even which index is "main".  The following will mimic what you get but with better performance.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog
  path=/tmp/jira_assets_extract.ini)
|eval block = coalesce(block, 'spec.name')
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block

I vaguely get the sense that the "main" index - I assume that's infrastructure which produces spec.name field - lacks domain.com in some events, causing missed matches.  You cannot solve this problem by wildcard.

One key piece of information you also did not clarify is what possible values of "domain.com" are, given that this is simply a stand-in string.  If there are more than one value for "domain.com", and "name" part could match multiple "domain.com" and represent different hostnames, your problem is unsolvable.

The only way the problem is solvable is if "domain.com" doesn't matter, i.e., if "name" part is unique for any hostname.  If this is the case, you can strip out the "domain.com" part in spec.name.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini)
| rex field=spec.name "^(?<block>[^\.]+)"
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block

 

View solution in original post

Tags (1)

yuanliu
SplunkTrust
SplunkTrust

The most important lesson you can learn here is: Don't join.  Meanwhile, your description is inconsistent about which field is really hostname, and even which index is "main".  The following will mimic what you get but with better performance.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog
  path=/tmp/jira_assets_extract.ini)
|eval block = coalesce(block, 'spec.name')
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block

I vaguely get the sense that the "main" index - I assume that's infrastructure which produces spec.name field - lacks domain.com in some events, causing missed matches.  You cannot solve this problem by wildcard.

One key piece of information you also did not clarify is what possible values of "domain.com" are, given that this is simply a stand-in string.  If there are more than one value for "domain.com", and "name" part could match multiple "domain.com" and represent different hostnames, your problem is unsolvable.

The only way the problem is solvable is if "domain.com" doesn't matter, i.e., if "name" part is unique for any hostname.  If this is the case, you can strip out the "domain.com" part in spec.name.

(index=infrastructure_reports source=nutanix_vm_host_report)
OR (index=syslog_main source=/var/log/messages sourcetype=linux_messages_syslog path=/tmp/jira_assets_extract.ini)
| rex field=spec.name "^(?<block>[^\.]+)"
| fields block spec.cluster spec.resources.memory_size_mib operating_system
| stats values(*) as * by block

 

Tags (1)

jfraley
Path Finder

That is perfect. Exactly what I needed. This was the most helpful reply to any question I think I have ever posted to a forum.

0 Karma

LAME-Creations
SplunkTrust
SplunkTrust

This is exactly the way to solve this problem.  Honestly, as you start to really master splunk you will find that Stats seems to be the answer for everything.  

this is a very helpful presentation on your very problem.  

let-stats-sort-them-out-building-complex-result-sets-that-use-multiple-source-types.pdf
slide 33

0 Karma
Get Updates on the Splunk Community!

What the End of Support for Splunk Add-on Builder Means for You

Hello Splunk Community! We want to share an important update regarding the future of the Splunk Add-on Builder ...

Solve, Learn, Repeat: New Puzzle Channel Now Live

Welcome to the Splunk Puzzle PlaygroundIf you are anything like me, you love to solve problems, and what ...

Building Reliable Asset and Identity Frameworks in Splunk ES

 Accurate asset and identity resolution is the backbone of security operations. Without it, alerts are ...