Hi,
I have a Kafka cluster running, and periodically the active controller fails. When that happens, the Splunk sink connector runs into trouble and stops streaming audit data from Cloudera to Splunk. The connector keeps logging errors like this:
[2019-02-21 14:54:10,672] INFO add 1 failed batches (com.splunk.kafka.connect.SplunkSinkTask:322)
[2019-02-21 14:54:10,672] INFO total failed batches 263 (com.splunk.kafka.connect.SplunkSinkTask:47)
[2019-02-21 14:54:10,672] ERROR failed to send batch (com.splunk.kafka.connect.SplunkSinkTask:261)
com.splunk.hecclient.HecException: All channels have back pressure
at com.splunk.hecclient.LoadBalancer.send(LoadBalancer.java:62)
at com.splunk.hecclient.Hec.send(Hec.java:233)
at com.splunk.kafka.connect.SplunkSinkTask.send(SplunkSinkTask.java:257)
at com.splunk.kafka.connect.SplunkSinkTask.handleFailedBatches(SplunkSinkTask.java:127)
at com.splunk.kafka.connect.SplunkSinkTask.put(SplunkSinkTask.java:62)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:495)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:288)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:198)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
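For context, a sink connector like this is usually registered through the Kafka Connect REST API with a configuration along the lines of the sketch below (Python, using requests). This is only an illustration: the worker URL, connector name, topic, HEC endpoint, and token are placeholders, and the splunk.hec.* property names follow the splunk-kafka-connect connector's documented settings.

import requests

connect_url = "http://localhost:8083"      # assumed Kafka Connect worker REST endpoint
connector_name = "splunk-audit-sink"       # hypothetical connector name

config = {
    "connector.class": "com.splunk.kafka.connect.SplunkSinkConnector",
    "topics": "cloudera-audit",                                   # placeholder topic
    "splunk.hec.uri": "https://splunk.example.com:8088",          # placeholder HEC endpoint
    "splunk.hec.token": "00000000-0000-0000-0000-000000000000",   # placeholder token
    "splunk.hec.ack.enabled": "true",   # with acks on, unacknowledged batches build back pressure
    "tasks.max": "1",
}

# PUT /connectors/<name>/config creates the connector or updates its configuration.
resp = requests.put(f"{connect_url}/connectors/{connector_name}/config", json=config)
resp.raise_for_status()
print(resp.json())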
I have restarted the connector and reset the sink connector configuration, but it is still producing this error an hour later.
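For anyone hitting the same thing: the restart itself can be done through the Kafka Connect REST API. A minimal sketch, assuming a Connect worker on localhost:8083 and the hypothetical connector name used above:

import requests

connect_url = "http://localhost:8083"    # assumed Connect worker REST endpoint
connector_name = "splunk-audit-sink"     # hypothetical connector name

# GET /connectors/<name>/status reports whether the connector and each of its
# tasks is RUNNING or FAILED; a failed task includes the last exception trace.
status = requests.get(f"{connect_url}/connectors/{connector_name}/status").json()
print("connector:", status["connector"]["state"])
for task in status["tasks"]:
    print("task", task["id"], task["state"])

# POST /connectors/<name>/restart restarts the connector; individual failed
# tasks can be restarted with POST /connectors/<name>/tasks/<id>/restart.
requests.post(f"{connect_url}/connectors/{connector_name}/restart").raise_for_status()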
NOTE: This setup has been working for almost two weeks and only started doing this today, although it did the same thing about five days ago: it hit the same error and then recovered on its own.
Thanks!
ANSWER:
The problem was that the Splunk server was down. I had not received notice that it was going to be worked on, so I assumed the problem was on my side.
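In hindsight, the quickest way to rule this out is to check the Splunk HEC endpoint directly before assuming the problem is on the Kafka side. A minimal sketch, with a placeholder Splunk host, using HEC's health endpoint:

import requests

# Placeholder host; 8088 is the default HEC port.
hec_health_url = "https://splunk.example.com:8088/services/collector/health"

try:
    # verify=False only if HEC uses a self-signed certificate.
    resp = requests.get(hec_health_url, timeout=10, verify=False)
    print(resp.status_code, resp.text)   # a healthy HEC responds with HTTP 200
except requests.exceptions.RequestException as exc:
    # Connection errors or timeouts here point at the Splunk side being down or unreachable.
    print("HEC unreachable:", exc)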