Getting Data In

Splunk Logging Driver Bringing Down the Entire Docker Swarm Cluster



We implemented Docker log collection using the Splunk logging driver, and it pushes the container logs just fine. But we have a bigger problem now.

Suppose my Splunk indexer is down while I'm spinning up Docker containers: those containers cannot establish a connection to the indexer machine, and that crashes the entire Docker engine on the host. You can no longer execute any Docker commands on those machines; the whole engine hangs. To recover I had to restart the VM, since restarting the Docker service did not help.

How can I mitigate this error?

Is this a Docker issue or a Splunk one?

Here is the swarm stack file I'm using:

version: '3'
services:
  worker:
    image: "${DOCKER_IMAGE_PATH}/worker:${RELEASE_TAG}"
    build:
      context: ../../
      dockerfile: ../Dockerfile-worker
    deploy:
      replicas: 3
    ports:
      - "8083:3000"
    logging:
      driver: splunk
      options:
        splunk-url: "${SPLUNK_URL}"
        splunk-token: "${SPLUNK_TOKEN}"
        splunk-insecureskipverify: "true"
        tag: "{{.Name}}/{{.ID}}"
        labels: "NEurope"
        env: "${TARGET_NAME}"
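One mitigation worth trying (a sketch, not verified against this exact setup): Docker's logging subsystem supports a non-blocking delivery mode via the `mode` and `max-buffer-size` log options, and the Splunk driver additionally has a `splunk-verify-connection` option that controls whether the connection to the indexer is verified at container start. Setting these should let containers start and keep running even when the indexer is unreachable, at the cost of dropping buffered log messages if the buffer fills up:

```yaml
    logging:
      driver: splunk
      options:
        splunk-url: "${SPLUNK_URL}"
        splunk-token: "${SPLUNK_TOKEN}"
        # Don't verify the connection to the indexer at container start,
        # so containers can come up while the indexer is down.
        splunk-verify-connection: "false"
        # Deliver logs asynchronously so a slow or unreachable indexer
        # cannot block the container's stdout/stderr writes.
        mode: "non-blocking"
        # In-memory ring buffer for pending messages; the oldest entries
        # are dropped when it fills (the default is 1MB).
        max-buffer-size: "4m"
```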

If the Splunk driver works like this, then I would need to rebuild/restart the Docker containers every single time the Splunk server (indexer) restarts.




Do you run your Splunk indexer in the same Docker Swarm you are sending logs from? You may want to separate the infra and prod clusters.

It is unexpected that you see crashes or hangs after a Splunk indexer restart. This behavior should be reported on the Docker repository.

If you have only one indexer, I would suggest creating a fleet of Splunk heavy forwarders; that way, when you need to restart the Splunk cluster, you can restart it one node at a time.

If you don’t mind paid solutions, I can suggest our solution for monitoring and log forwarding, where we implemented log forwarding on top of the default JSON logging driver, so it has no effect on Docker Swarm. On top of that, you get application monitoring. You can find out how to install our solution here; you can try it for free, as our images have a built-in trial license.


No. I'm not running the Splunk indexer machine on the swarm cluster; it is a stand-alone machine sitting outside the cluster.

I believe this is happening because we have some timeouts on the Splunk indexer machine.

I noticed some timeout errors in the Docker engine logs. Is Docker going to hang on every single timeout?

Even if you set up a cluster with multiple heavy forwarders, that is not going to help, because you may still hit timeouts due to the network.

Please let me know if you have any thoughts...!

We are already in the process of procuring Splunk; at the moment we don't have direct support.




Having multiple indexers will help with indexer availability, but will not solve the networking problem. You can also install a heavy forwarder on each node, so you no longer have networking issues between the container and the forwarder; the forwarders then send data on to the indexers whenever they are available.

The hang you are experiencing is unexpected. My assumption is that the Splunk logging driver does not set a read timeout: the connection gets dropped on one end, but the driver never closes its side, so it waits indefinitely for a response. It does not look like the Splunk logging driver sets a timeout on its http.Client, so you could send a PR to add one.

That should solve this problem partially.

But again, I suggest you take a look at our solution, as our log forwarding does not depend on the Splunk logging driver: you write the logs in JSON, and our collector tails the JSON logs and forwards them to Splunk. We have a free 30-day trial. Give it a try, or send us an email to learn more; we can schedule a call and discuss all the issues you're experiencing.
