
Splunk Kafka Connect: Why am I getting the NOT_ENOUGH_REPLICAS error?

johnl317
New Member

Hi all,

I have a Splunk Kafka Connect setup, which I installed from GitHub.
I started Kafka Connect after editing config/connect-distributed.properties and setting bootstrap.servers to the correct values (a cluster of 3 Kafka servers).
Then I used a curl command to add a topic to monitor, and Splunk Kafka Connect started throwing the following error:

 

 

[2020-11-17 13:08:34,095] WARN [Producer clientId=producer-3] Got error produce response with correlation id 410 on topic-partition _kafka-connect-splunk-task-config-0, retrying (214748341 attempts left). Error: NOT_ENOUGH_REPLICAS (org.apache.kafka.clients.producer.internals.Sender:525)

 

 

 

The relevant part of config/connect-distributed.properties is:

 

 

group.id=kafka-connect-splunk-hec-sink
config.storage.topic=_kafka-connect-splunk-task-configs
config.storage.replication.factor=3

offset.storage.topic=_kafka-connect-splunk-offsets
offset.storage.replication.factor=3
offset.storage.partitions=25

status.storage.topic=_kafka-connect-splunk-statuses
status.storage.replication.factor=3
status.storage.partitions=5

 

 


And the curl command looks like:

 

 

curl localhost:8083/connectors -X POST -H "Content-Type: application/json" -d '{
"name": "kafka-connect-splunk",
"config": {
"connector.class": "com.splunk.kafka.connect.SplunkSinkConnector",
"tasks.max": "3",
"topics":"Topic_name",
"splunk.indexes": "",
"splunk.hec.uri": "https://splunk:8088",
"splunk.hec.token": "Token",
"splunk.hec.raw": "true",
"splunk.hec.raw.line.breaker": "",
"splunk.hec.ack.enabled": "true",
"splunk.hec.ack.poll.interval": "10",
"splunk.hec.ack.poll.threads": "2",
"splunk.hec.ssl.validate.certs": "false",
"splunk.hec.http.keepalive": "true",
"splunk.hec.max.http.connection.per.channel": "4",
"splunk.hec.total.channels": "8",
"splunk.hec.max.batch.size": "1000000",
"splunk.hec.threads": "2",
"splunk.hec.event.timeout": "300",
"splunk.hec.socket.timeout": "120",
"splunk.hec.track.data": "true"
}
}'

 

 

 

I would appreciate any guidance on what to check. Also, what are the best practices for configuring Splunk Kafka Connect?

Thank you in advance,
John


jamie00171
Communicator

Hi @johnl317 

When you use curl / the REST API to create a connector, the worker (Kafka Connect process) that receives the request forwards it to the leader worker. The leader then uses a Kafka producer to write the task configuration to Kafka Connect's internal config.storage.topic, which is where your error is occurring.

If you have a cluster of 3 brokers, I suspect the problem is with replication in the cluster. Since the producer is internal to Kafka Connect, it sets "acks" to "all": https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/st...

As far as I can see, min.insync.replicas is left at its default of 1, so as long as one replica is "in sync" you should be able to produce to the topic. From the error message, I suspect that no replicas were in sync for the partition the producer was trying to write the connector configuration to.
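To make the rule concrete, here is a minimal sketch of the check a broker applies to an acks=all write: the write is accepted only when the partition's in-sync replica (ISR) count is at least min.insync.replicas, otherwise the producer sees NOT_ENOUGH_REPLICAS. The function names are hypothetical helpers for illustration, not part of Kafka.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; they are not part of Kafka.

# Print the number of broker ids in a comma-separated ISR list such as "1,2,3".
# An empty list counts as 0.
isr_count() {
  if [ -z "$1" ]; then
    echo 0
  else
    echo "$1" | awk -F',' '{ print NF }'
  fi
}

# Exit 0 when an acks=all write would be accepted, i.e. the ISR size is at
# least min.insync.replicas; exit non-zero otherwise.
#   $1 = ISR list, $2 = min.insync.replicas
accepts_acks_all_write() {
  [ "$(isr_count "$1")" -ge "$2" ]
}

# Example:
#   accepts_acks_all_write "1,2,3" 1 && echo "write accepted"
#   accepts_acks_all_write "" 1      || echo "NOT_ENOUGH_REPLICAS"
```

With a min.insync.replicas of 1, only an empty ISR fails this check. If the broker or topic has been configured with a higher value, say 2, a single in-sync replica is no longer enough, so it is worth confirming the effective setting as well.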

You could confirm this by running something like the following on one of the Kafka brokers:

/bin/kafka-topics.sh --zookeeper $ZK_connection_string --describe --topic _kafka-connect-splunk-task-configs

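Note - this command might need to change slightly depending on the version of Kafka you are running. For example, newer releases of kafka-topics.sh take --bootstrap-server with a broker address instead of --zookeeper.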
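If the topic has many partitions, the describe output can be filtered rather than read by eye. The following is a rough sketch, assuming the usual per-partition line layout ("Topic: ... Partition: N Leader: ... Replicas: ... Isr: ..."); partitions_below_min_isr is a hypothetical helper name, and the exact output columns can vary between Kafka versions.

```shell
#!/bin/sh
# Hypothetical helper, not part of Kafka: reads `kafka-topics.sh --describe`
# output on stdin and prints every partition whose ISR has fewer members
# than the threshold given as $1.
partitions_below_min_isr() {
  awk -v min_isr="$1" '
    # Per-partition lines carry both "Partition:" and "Isr:" labels;
    # the topic summary line ("PartitionCount: ...") does not match.
    /Partition:/ && /Isr:/ {
      part = ""; isr = ""
      for (i = 1; i <= NF; i++) {
        if ($i == "Partition:") part = $(i + 1)
        # Skip the value if it is really the next label (empty ISR).
        if ($i == "Isr:" && i < NF && $(i + 1) !~ /:$/) isr = $(i + 1)
      }
      n = split(isr, members, ",")
      if (n < min_isr) print "partition " part " has " n " in-sync replica(s)"
    }
  '
}

# Example (topic name taken from config.storage.topic in the question):
#   /bin/kafka-topics.sh --zookeeper $ZK_connection_string --describe \
#     --topic _kafka-connect-splunk-task-configs | partitions_below_min_isr 1
```

Any line it prints points at a partition that would reject acks=all writes for the given threshold; no output means every partition currently has enough in-sync replicas.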

For questions like this, it might be worth asking the users@kafka.apache.org mailing list.

Thanks, 

Jamie
