In a nutshell, you need to search for both the forwarders and the hosts, then compare the two result sets. That tells you whether it's a host problem or a forwarder problem.
Here is the dashboard panel I use for this:
<module name="HiddenSearch" layoutPanel="panel_row5_col1" autoRun="True">
<!-- Find and report on all Splunk Universal Forwarders and endpoints not running SUF. Skip IPs in the SUFExceptions file. -->
<param name="search"><![CDATA[index=_internal source="/opt/splunk/var/log/splunk/metrics.log*" sourcetype="splunkd" fwdType="*" |
dedup sourceHost | rename IPAddress AS hostip, sourceHost AS IPAddress, OS AS fOS |
fields IPAddress, hostname, fGUID, fOS, fwdType | append [loadjob savedsearch="my:app:HWDetailBase" |
rename OS AS hOS | fields IPAddress, ComputerName, hOS] |
transaction IPAddress |
eval HostName=coalesce(ComputerName, hostname) | eval OS=coalesce(hOS, fOS) |
eval "Forwarder State"=if(isnotnull(fwdType),"Running","NOT RUNNING") |
search [|inputlookup SUFExceptions.csv append=f| fields IPAddress |format "NOT (" "(" "" ")" "OR" ")"] |
sort "Forwarder State" | table IPAddress, HostName, OS, "Forwarder State"
]]></param>
<param name="groupLabel">Forwarder Status</param>
<module name="JobProgressIndicator"></module>
<param name="earliest">-24h</param>
<param name="latest">now</param>
<module name="PostProcess" layoutPanel="panel_row5_col1">
<param name="search"> | rename "Forwarder State" AS fState |
stats count(eval(fState=="NOT RUNNING")) AS nRun</param>
<module name="HTML" layoutPanel="panel_row5_col1">
<param name="html"><![CDATA[
<table>
<tr><td>Hosts:</td><td width=3></td><td>$results.resultCount$</td><td width=8></td><td>Not running:</td><td width=3></td><td>$results[0].nRun$</td></tr>
</table>
]]></param>
</module>
</module>
</module>
The SUFExceptions.csv file contains a single field, IPAddress, and is where I put hosts I know aren't running a forwarder. It saves modifying a lengthy where clause every time there's a change to the exception list.
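To make the exclusion step concrete, here is a rough sketch of what the `inputlookup ... | format` subsearch hands back to the outer `search`. It assumes SUFExceptions.csv holds just two rows, 10.0.0.5 and 10.0.0.9; both addresses are made-up placeholders, and the exact spacing of the generated string may differ on your version.

```
search NOT ( ( IPAddress="10.0.0.5" ) OR ( IPAddress="10.0.0.9" ) )
```

The six arguments to `format` are the row prefix, column prefix, column separator, column end, row separator, and row end. Because the lookup has only one field, the empty column separator never appears in the output.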
The HWDetailBase search is a bit too long to list here, but it essentially combines all of our sources of host information (such as port_scan) and returns IPAddress, ComputerName, and OS fields.