Getting Data In

8.0.1 upgraded Heavy Forwarder - TcpOutputProc - Possible duplication of events


We have a support ticket open, but I thought I'd also ask the community. Since upgrading our Splunk to 8.0.1, this one HF has been spewing "TcpOutputProc - Possible duplication of events" for most channels, as well as "TcpOutputProc - Applying quarantine to ip=xx.xx.xx.xx port=9998 _numberOfFailures=2".

We upgraded on the 15th near midnight. Here is a count of those errors from that host:
2020-02-14 0
2020-02-15 623
2020-02-16 923874
2020-02-17 396920
2020-02-18 678568
2020-02-19 602100
2020-02-20 459284
2020-02-21 1177642
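For reference, a search along these lines should reproduce the daily counts above (the host name is taken from the channel search later in this post; adjust to match your HF):

index=_internal host=ghdsplfwd01lps sourcetype=splunkd component=TcpOutputProc log_level=WARN ("Possible duplication of events" OR "Applying quarantine")
| timechart span=1d count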

Here is a count from the indexer cluster showing the number of blocked=true events. One would expect these counts to be similar if the indexers were telling the HF to go elsewhere because their queues were full.

index=_internal host=INDEXERNAMES sourcetype=splunkd source=/opt/splunk/var/log/splunk/metrics.log blocked=true component=Metrics
| timechart span=1d count by source

2020-02-14 7
2020-02-15 180
2020-02-16 260
2020-02-17 15
2020-02-18 18
2020-02-19 2415
2020-02-20 1
2020-02-21 2

Lastly, it's not just one source or channel; it's everything from the host.

index=_internal component=TcpOutputProc host=ghdsplfwd01lps log_level=WARN duplication
| rex field=event_message "channel=source::(?<channel>[^|]+)"
| stats count by channel

/opt/splunk/var/log/introspection/disk_objects.log 51395
/opt/splunk/var/log/introspection/resource_usage.log 45470
mule-prod-analytics 42192
/opt/splunk/var/log/splunk/metrics.log 28283
web_ping://PROD_CommerceHub 27881
web_ping://V8_PROD_CustomSolr5 27877
web_ping://V8_PROD_WebServer4 27873
web_ping://EnterWorks PRD 27871
web_ping://RTP DEV 27870
web_ping://Ensighten 27869
web_ping://RTP 27867
bandwidth 20570
cpu 19949
iostat 19946
ps 19821

Any ideas?




If you have many separate transforms in props.conf for individual sources/sourcetypes etc., try combining them into one line, e.g.

TRANSFORMS-foo = foo1
TRANSFORMS-bar = bar1

becomes

TRANSFORMS-foobar = foo1, bar1

This helped in our case after upgrading from 6.6.5 to 7.3.3.
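In props.conf these keys live under a source/sourcetype stanza, and the transforms in one class run in the listed order. A minimal sketch (the stanza and transform names are hypothetical):

# props.conf -- before: two transform classes evaluated separately
[hypothetical_sourcetype]
TRANSFORMS-foo = foo1
TRANSFORMS-bar = bar1

# props.conf -- after: one class running both transforms in order
[hypothetical_sourcetype]
TRANSFORMS-foobar = foo1, bar1

The referenced [foo1] and [bar1] stanzas in transforms.conf stay unchanged; only the number of TRANSFORMS-* classes per stanza shrinks.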




The HF is still "sick" but here are some things we did that seemed to help.

  1. Edited the outputs.conf that this HF uses so it outputs only to indexers within its own site.
  2. Removed useACK=true from outputs.conf
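A sketch of what the edited outputs.conf might look like after both changes (the group name, server addresses, and port are hypothetical):

# outputs.conf on the HF -- point only at indexers in the HF's own site
[tcpout]
defaultGroup = site_local_indexers

[tcpout:site_local_indexers]
server = idx1.example.com:9998, idx2.example.com:9998
# useACK removed (it defaults to false); the duplication warnings stop,
# but delivery is no longer acknowledged end to end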

I'm a little concerned about #2 there. We could still be having issues with the outputs, only now the events are being dropped on the floor. In other words, the condition may still be present; we have simply turned off the logging by removing useACK.
