Getting Data In

Why am I seeing duplicate events in my metrics indexes?

rphillips_splk
Splunk Employee

I am seeing duplicate events in a metrics index, help!

 

Deployment flow:
HEC client ---> load balancer ---> HFs (HEC receivers) ---> indexers (metrics index)

1 Solution

rphillips_splk
Splunk Employee

Note: the HF is using useACK=true in outputs.conf, which is causing the duplication of events.

Root cause:

useACK is not supported for events where the _raw field is missing, and by design metrics data does not contain a _raw field.
Do not use useACK=true if you are sending data to a metrics index, where the events have no _raw. useACK is implemented to track _raw; if _raw is not present, no ACK is sent back (from indexer to HF), which causes duplicate events.
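For comparison, data sent to the HEC event endpoint carries its payload in the event field, which becomes _raw at index time, so the ACK mechanism has something to track. A hypothetical example (the token, host, and index are placeholders):

curl -k https://lb.sv.splunk.com:8088/services/collector/event \
-H "Authorization: Splunk <token>" \
-d '{"time": 1614193927.000,"source":"disk","host":"host_77","index":"main","sourcetype":"_json","event":{"message":"disk check ok"}}'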


To reproduce the problem:

Send 1 JSON event to the metrics index:

curl -k https://lb.sv.splunk.com:8088/services/collector \
-H "Authorization: Splunk <token>" \
-d '{"time": 1614193927.000,"source":"disk","host":"host_77","fields":{"region":"us-west-1","datacenter":"us-west-1a","rack":"63","os":"Ubuntu16.10","arch":"x64","team":"LON","service":"6","service_version":"0","service_environment":"test","path":"/dev/sda1","fstype":"ext3","_value":1099511627776,"metric_name":"total"}}'

Wait 5 minutes, then search the metrics index and you will see the duplicate event:

| msearch index="mymetrics"
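(On newer Splunk versions msearch was renamed mpreview.) Another way to confirm the duplication is to count the data points for the metric; this is a sketch that assumes the example payload above was ingested and that your version accepts the _value aggregation syntax. A count greater than 1 per host indicates duplicates:

| mstats count(_value) AS datapoints WHERE index="mymetrics" AND metric_name="total" BY host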


You will see the ACK timeout and the duplication warnings because the indexers never send back an ACK, so the HF re-sends the event every 300 seconds.

For example, in the HF's splunkd.log:

02-24-2021 14:47:49.702 -0500 WARN TcpOutputProc - Read operation timed out expecting ACK from 10.10.10.1:9997 in 300 seconds.
02-24-2021 14:47:49.703 -0500 WARN TcpOutputProc - Possible duplication of events with channel=source::disk|host::host_xx-withACK|_json|, streamId=0, offset=0 on host=10.10.10.1:9997

02-24-2021 14:52:51.498 -0500 WARN TcpOutputProc - Read operation timed out expecting ACK from 10.10.10.1:9997 in 300 seconds.
02-24-2021 14:52:51.498 -0500 WARN TcpOutputProc - Possible duplication of events with channel=source::disk|host::host_xx-withACK|_json|, streamId=0, offset=0 on host=10.10.10.1:9997
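To see how widespread this is on the HF, you can also search its internal logs for the same warnings (a sketch; narrow by host to your HF names if needed):

index=_internal sourcetype=splunkd component=TcpOutputProc ("expecting ACK" OR "Possible duplication of events")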

 


Solution:
1.) Create two output groups on the HF, one for event data and one for metric data. For the metric data output group, set useACK = false.
For example, on the HF in outputs.conf:

[tcpout]
defaultGroup = clustered_indexers_with_useACK

[tcpout:clustered_indexers_with_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = true

[tcpout:clustered_indexers_without_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = false


2.) On the specific HTTP input stanzas where metric data is received, use the outputgroup attribute to send the data to the output group where useACK=false.

The relevant setting from the inputs.conf spec:

[http://<name>]
outputgroup = <string>
* The name of the forwarding output group to send data to.
* Default: empty string

Example HF inputs.conf:

[http]
busyKeepAliveIdleTimeout = 90
dedicatedIoThreads = 2
disabled = 0
enableSSL = 1
port = 8088
queueSize = 2MB


[http://metrics_data]
disabled = 0
host = hf1
index = mymetrics
indexes = mymetrics
sourcetype = _json
token = <token>
outputgroup = clustered_indexers_without_useACK

[http://event_data]
disabled = 0
host = hf1
index = main
indexes = main
sourcetype = _json
token = <token>


Since defaultGroup = clustered_indexers_with_useACK, any input that does not specify an outputgroup sends its data to the default output group, so we don't need to declare it in the [http://event_data] input stanza.
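To verify which output group each HEC input will actually use after the change, you can inspect the merged configuration on the HF with btool (assuming a standard $SPLUNK_HOME):

$SPLUNK_HOME/bin/splunk btool outputs list tcpout --debug
$SPLUNK_HOME/bin/splunk btool inputs list http --debug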

 

 


rphillips_splk
Splunk Employee

Deployment flow:
client ---> load balancer ---> HFs ---> indexers


rphillips_splk
Splunk Employee

Note:
If you are trying to set the output group for your token via the Splunk Web UI, you may notice that no output groups show up under Settings > Data inputs > HTTP Event Collector > New Token.

You must set disabled = false on the groups in outputs.conf for them to appear in the UI.

 

For example:

[tcpout]
defaultGroup = clustered_indexers_with_useACK
disabled = false

[tcpout:clustered_indexers_with_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = true
disabled = false

[tcpout:clustered_indexers_without_useACK]
server = idx1.splunk.com:9997,idx2.splunk.com:9997,idx3.splunk.com:9997
useACK = false
disabled = false
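Since outputs.conf changes made directly on disk are typically not picked up on the fly, restart splunkd on the HF after adding the disabled = false settings so the groups become selectable in the token setup UI; for example:

$SPLUNK_HOME/bin/splunk restart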

 
