<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: why am I seeing duplicate events in my metrics indexes? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/why-am-I-seeing-duplicate-events-in-my-metrics-indexes/m-p/544252#M90829</link>
    <description>&lt;P&gt;Note:&lt;BR /&gt;If you try to assign an output group to your token in Splunk Web, you may notice that no output groups are listed.&lt;BR /&gt;Settings &amp;gt; Data inputs &amp;gt; HTTP Event Collector &amp;gt; New Token&lt;/P&gt;&lt;P&gt;You must set disabled = false on the output groups in outputs.conf for them to appear in the UI.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;e.g.:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[tcpout]
defaultGroup = clustered_indexers_with_useACK
disabled = false

[tcpout:clustered_indexers_with_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = true
disabled = false

[tcpout:clustered_indexers_without_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = false
disabled = false&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 17 Mar 2021 18:58:07 GMT</pubDate>
    <dc:creator>rphillips_splk</dc:creator>
    <dc:date>2021-03-17T18:58:07Z</dc:date>
    <item>
      <title>why am I seeing duplicate events in my metrics indexes?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/why-am-I-seeing-duplicate-events-in-my-metrics-indexes/m-p/541268#M90562</link>
      <description>&lt;P&gt;I am seeing duplicate events in a metrics index, help!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;deployment flow:&lt;BR /&gt;hec client---&amp;gt;load balancer---&amp;gt;HFs (hec receivers)---&amp;gt;Indexers (metrics index)&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2021 20:32:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/why-am-I-seeing-duplicate-events-in-my-metrics-indexes/m-p/541268#M90562</guid>
      <dc:creator>rphillips_splk</dc:creator>
      <dc:date>2021-02-24T20:32:59Z</dc:date>
    </item>
    <item>
      <title>Re: why am I seeing duplicate events in my metrics indexes?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/why-am-I-seeing-duplicate-events-in-my-metrics-indexes/m-p/541269#M90563</link>
      <description>&lt;P&gt;deployment flow:&lt;BR /&gt;client---&amp;gt;load balancer---&amp;gt;HFs---&amp;gt;Indexers&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2021 20:18:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/why-am-I-seeing-duplicate-events-in-my-metrics-indexes/m-p/541269#M90563</guid>
      <dc:creator>rphillips_splk</dc:creator>
      <dc:date>2021-02-24T20:18:31Z</dc:date>
    </item>
    <item>
      <title>Re: why am I seeing duplicate events in my metrics indexes?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/why-am-I-seeing-duplicate-events-in-my-metrics-indexes/m-p/541270#M90564</link>
      <description>&lt;P&gt;Note: the HF is using useACK=true in outputs.conf, which is causing the duplicate events.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;root cause:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;useACK is not available for events where the _raw field is missing. By design, metrics data does not contain an _raw field.&lt;BR /&gt;Do not use useACK=true if you are sending events to a metrics index where the event is missing _raw. useACK is implemented to track _raw. If _raw is not present, the indexer never sends an ACK back to the HF, so the HF resends the data and duplicate events result.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;reproduce the problem:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Send one JSON event to the metrics index:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;curl -k https://lb.sv.splunk.com:8088/services/collector \
-H "Authorization: Splunk &amp;lt;token&amp;gt;" \
-d '{"time": 1614193927.000,"source":"disk","host":"host_77","fields":{"region":"us-west-1","datacenter":"us-west-1a","rack":"63","os":"Ubuntu16.10","arch":"x64","team":"LON","service":"6","service_version":"0","service_environment":"test","path":"/dev/sda1","fstype":"ext3","_value":1099511627776,"metric_name":"total"}}'&lt;/LI-CODE&gt;&lt;P&gt;Wait 5 minutes, then search the metrics index to see the duplicate event:&lt;/P&gt;&lt;P&gt;&lt;FONT color="#0000FF"&gt;| msearch index="mymetrics"&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;The HF's splunkd.log shows the ACK timeout and the duplication warning. Because the indexers never send back an ACK, the HF resends the event every 300 seconds.&lt;/P&gt;&lt;P&gt;e.g.:&lt;BR /&gt;&lt;STRONG&gt;HF splunkd.log:&lt;/STRONG&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;02-24-2021 14:47:49.702 -0500 WARN TcpOutputProc - Read operation timed out expecting ACK from 10.10.10.1:9997 in 300 seconds.
02-24-2021 14:47:49.703 -0500 WARN TcpOutputProc - Possible duplication of events with channel=source::disk|host::host_xx-withACK|_json|, streamId=0, offset=0 on host=10.10.10.1:9997

02-24-2021 14:52:51.498 -0500 WARN TcpOutputProc - Read operation timed out expecting ACK from 10.10.10.1:9997 in 300 seconds.
02-24-2021 14:52:51.498 -0500 WARN TcpOutputProc - Possible duplication of events with channel=source::disk|host::host_xx-withACK|_json|, streamId=0, offset=0 on host=10.10.10.1:9997&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;solution:&lt;/STRONG&gt;&lt;BR /&gt;1.) Create two output groups on the HF, one for event data and one for metric data. In the output group for metric data, set useACK=false.&lt;BR /&gt;e.g. HF outputs.conf:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[tcpout]
defaultGroup = clustered_indexers_with_useACK

[tcpout:clustered_indexers_with_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = true

[tcpout:clustered_indexers_without_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = false&lt;/LI-CODE&gt;&lt;P&gt;&lt;BR /&gt;2.) On each HTTP input stanza that receives metric data, set the outputgroup attribute so the data is sent to the output group where useACK=false.&lt;/P&gt;&lt;P&gt;HF inputs.conf:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[http://&amp;lt;name&amp;gt;]
outputgroup = &amp;lt;string&amp;gt;
* The name of the forwarding output group to send data to.
* Default: empty string&lt;/LI-CODE&gt;&lt;P&gt;example HF:&lt;BR /&gt;inputs.conf&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[http]
busyKeepAliveIdleTimeout = 90
dedicatedIoThreads = 2
disabled = 0
enableSSL = 1
port = 8088
queueSize = 2MB


[http://metrics_data]
disabled = 0
host = hf1
index = mymetrics
indexes = mymetrics
sourcetype = _json
token = &amp;lt;token&amp;gt;
outputgroup = clustered_indexers_without_useACK
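
# Hypothetical extra input, not part of the original example: assuming a second
# token also receives metric data, it would need the same outputgroup override.
# The stanza name below is a placeholder; uncomment and adapt before use.
# [http://more_metrics_data]
# disabled = 0
# index = mymetrics
# indexes = mymetrics
# sourcetype = _json
# token = &amp;lt;token&amp;gt;
# outputgroup = clustered_indexers_without_useACK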

[http://event_data]
disabled = 0
host = hf1
index = main
indexes = main
sourcetype = _json
token = &amp;lt;token&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;&lt;BR /&gt;Because defaultGroup = clustered_indexers_with_useACK, any input that does not specify an outputgroup sends its data to the default output group, so we don't need to declare one in the [http://event_data] input stanza.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2021 20:23:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/why-am-I-seeing-duplicate-events-in-my-metrics-indexes/m-p/541270#M90564</guid>
      <dc:creator>rphillips_splk</dc:creator>
      <dc:date>2021-02-24T20:23:50Z</dc:date>
    </item>
    <item>
      <title>Re: why am I seeing duplicate events in my metrics indexes?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/why-am-I-seeing-duplicate-events-in-my-metrics-indexes/m-p/544252#M90829</link>
      <description>&lt;P&gt;Note:&lt;BR /&gt;If you try to assign an output group to your token in Splunk Web, you may notice that no output groups are listed.&lt;BR /&gt;Settings &amp;gt; Data inputs &amp;gt; HTTP Event Collector &amp;gt; New Token&lt;/P&gt;&lt;P&gt;You must set disabled = false on the output groups in outputs.conf for them to appear in the UI.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;e.g.:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[tcpout]
defaultGroup = clustered_indexers_with_useACK
disabled = false

[tcpout:clustered_indexers_with_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = true
disabled = false

[tcpout:clustered_indexers_without_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = false
disabled = false&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 17 Mar 2021 18:58:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/why-am-I-seeing-duplicate-events-in-my-metrics-indexes/m-p/544252#M90829</guid>
      <dc:creator>rphillips_splk</dc:creator>
      <dc:date>2021-03-17T18:58:07Z</dc:date>
    </item>
  </channel>
</rss>

