Splunk Search

Help with join - Periodically it fails to join a small percentage of records

jmo1
Path Finder

I have a query to find missing forwarders.  It is based on code I received here and it is so very close to working.  Here is the code and my issue is below.

 

(fwdType=* group=tcpin_connections guid=* index=_internal sourcetype=splunkd (connectionType=cooked OR connectionType=cookedSSL)) 
| stats values(fwdType) as forwarder_type, latest(version) as version, values(arch) as arch, values(os) as os, max(_time) as last_connected, sum(kb) as new_sum_kb, avg(tcp_KBps) as new_avg_tcp_kbps, avg(tcp_eps) as new_avg_tcp_eps by guid, hostname
| stats values(forwarder_type) as forwarder_type, max(version) as version, values(arch) as arch, values(os) as os, max(last_connected) as last_connected, values(new_sum_kb) as sum_kb, values(new_avg_tcp_kbps) as avg_tcp_kbps, values(new_avg_tcp_eps) as avg_tcp_eps by guid, hostname 
| addinfo 
| eval status=if(((isnull(sum_kb) OR (sum_kb <= 0)) OR (last_connected < (info_max_time - 60))),"missing","active"), sum_kb=round(sum_kb,2), avg_tcp_kbps=round(avg_tcp_kbps,2), avg_tcp_eps=round(avg_tcp_eps,2) 
| eval age = now() - last_connected
| search age > 60 
| sort age d 
| convert ctime(last_connected) 
| rename hostname as host
| fields host, forwarder_type, version, arch, os, status, last_connected, age, sum_kb, avg_tcp_kbps, avg_tcp_eps
| join type=left host [| tstats count where index="*" by host, index | stats values(index) as indexes by host]

| append [| inputlookup MaintenanceToggle.csv |fields MaintenanceONOFF]|reverse|filldown MaintenanceONOFF |reverse|where MaintenanceONOFF!="ON"
| search NOT [ | inputlookup PreProduction.csv | rename Index as indexes ]
| search host=* NOT [ | inputlookup Forwarders.csv | rename IgnoreForwarder as host ]

 

Everything up to the join is code I found here and appears to work perfectly.  The result is all of the hosts where age > 60.  This works every time.

| join type=left host [| tstats count where index="*" by host, index | stats values(index) as indexes by host]

 

The join is meant to link a host with its index.  As my goal is to exempt certain indexes.  The join appears to work but occasionally fails to  link a host with the index, resulting in the proper results with a few hosts missing an index.  Obviously all of the hosts have an index.  

I am fairly confident of the remaining logic as it is working well for many other queries.  So the problem is that the join fails to join 100% of the time.  I have this scheduled to run every five minutes.  I can replicate the results using the exact time frame in which the query occurred and minutes later the exact same search will return properly.  Can anyone suggest what might be the cause of this and how to correct?

Thank you in advance.

Labels (3)
0 Karma
Get Updates on the Splunk Community!

Splunk Observability for AI

Don’t miss out on an exciting Tech Talk on Splunk Observability for AI!Discover how Splunk’s agentic AI ...

Splunk Enterprise Security 8.x: The Essential Upgrade for Threat Detection, ...

Watch On Demand the Tech Talk on November 6 at 11AM PT, and empower your SOC to reach new heights! Duration: ...

Splunk Observability as Code: From Zero to Dashboard

For the details on what Self-Service Observability and Observability as Code is, we have some awesome content ...