This issue has been identified as a product defect, with reference DEPMON-142.
In Splunk Enterprise 6.2, indexers log a new type of event to metrics.log under group=tcpin_connections. These events record forwarder connection activity, such as a connection closing:
11-13-2014 12:31:39.967 -0800 INFO StatusMgr - destPort=9997, eventType=connect_close, group=tcpin_connections, sourceHost=10.140.126.97, sourceIp=10.140.126.97, sourcePort=54692, statusee=TcpInputProcessor
Unfortunately, the Deployment Monitor searches do not expect these events under group=tcpin_connections and only expect records reporting metrics, such as this one:
11-13-2014 12:33:14.272 -0800 INFO Metrics - group=tcpin_connections, 127.0.1.1:33018:9997, connectionType=cooked, sourcePort=33018, sourceHost=127.0.1.1, sourceIp=127.0.1.1, destPort=9997, kb=10.08, _tcp_Bps=332.94, _tcp_KBps=0.33, _tcp_avg_thruput=0.14, _tcp_Kprocessed=354.89, _tcp_eps=0.42, _process_time_ms=1, old_evt_kBps=0.32, evt_misc_kBps=0.00, evt_raw_kBps=0.00, evt_fields_kBps=0.00, evt_fn_kBps=0.00, evt_fv_kBps=0.00, evt_fn_str_kBps=0.00, evt_fn_meta_dyn_kBps=0.00, evt_fn_meta_predef_kBps=0.00, evt_fn_meta_str_kBps=0.00, evt_fv_num_kBps=0.00, evt_fv_str_kBps=0.00, evt_fv_predef_kBps=0.00, evt_fv_offlen_kBps=0.00, build=149561, version=5.0.2, os=Linux, arch=x86_64, hostname=sosdev-ufwd-8, guid=EA1B8A53-350D-42D4-A08A-2670EC46208D, fwdType=uf, ssl=false, lastIndexer="10.140.48.33:9997,10.140.49.8:9997,127.0.1.1:9997", ack=true
This causes the logic of some searches in the Deployment Monitor app to fail, most notably those that list forwarders and/or attempt to detect missing forwarders.
The fix is simple: re-scope the base search in the "forwarder_metrics" macro so that it always excludes the connection events and keeps only the metric events.
Fortunately, you can apply this as a workaround yourself. Follow these steps, which assume that you have Deployment Monitor 5.0.3 installed:
Edit $SPLUNK_HOME/etc/apps/splunk_deployment_monitor/default/macros.conf
Find the definition of the "forwarder_metrics" macro on line 155 and change it like so:
Before:
index="_internal" source="*metrics.log*" group=tcpin_connections | ...
After:
index="_internal" source="*metrics.log*" group=tcpin_connections NOT eventType=* | ...
Restart Splunk, or request Splunk Web's .../debug/refresh endpoint to reload the macro definitions without a restart.
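To illustrate why adding NOT eventType=* is enough, here is a minimal Python sketch of the same filtering logic applied to shortened copies of the two sample events above. It is only an illustration: the regular expression is a rough stand-in for Splunk's automatic key=value field extraction, and the names extract_fields and is_forwarder_metric are hypothetical helpers, not part of Splunk or the Deployment Monitor app.

```python
import re

# Shortened copies of the two sample metrics.log events quoted above.
CONNECTION_EVENT = (
    "11-13-2014 12:31:39.967 -0800 INFO StatusMgr - destPort=9997, "
    "eventType=connect_close, group=tcpin_connections, "
    "sourceHost=10.140.126.97, sourceIp=10.140.126.97, sourcePort=54692, "
    "statusee=TcpInputProcessor"
)
METRIC_EVENT = (
    "11-13-2014 12:33:14.272 -0800 INFO Metrics - group=tcpin_connections, "
    "127.0.1.1:33018:9997, connectionType=cooked, sourcePort=33018, "
    "sourceHost=127.0.1.1, kb=10.08, fwdType=uf, ack=true"
)

# Assumption: a simple key=value pattern approximates Splunk's field extraction.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|[^,\s]+)')


def extract_fields(line):
    """Return the key=value pairs found in one log line as a dict."""
    return {key: value.strip('"') for key, value in KV_PATTERN.findall(line)}


def is_forwarder_metric(line):
    """Mimic: group=tcpin_connections NOT eventType=*"""
    fields = extract_fields(line)
    return fields.get("group") == "tcpin_connections" and "eventType" not in fields


# Only the metric event survives the filter; the connection event is dropped.
kept = [line for line in (CONNECTION_EVENT, METRIC_EVENT) if is_forwarder_metric(line)]
print(len(kept))
```

Both events carry group=tcpin_connections, so the group filter alone cannot tell them apart. Only the connection events carry an eventType field, which is why excluding any event that has that field restores the original scope of the macro.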