
Splunk Plugin for Jenkins (DevOps): Why has the plugin stopped working on one of my cluster masters?

hal_boggess
Explorer

Splunk (6.4.2) large cluster.

Splunk Plugin for Jenkins 1.3.1

I have the Splunk plugin on 4 Jenkins masters. One of the masters stopped sending data on Sunday (14 days since last restart of Jenkins) and I can't establish the connection again. The other 3 masters are still working and my curl test to the HTTP collector works from all 4 masters.
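For anyone who wants to repeat the connection test from a script rather than curl, here is a minimal sketch. It assumes the standard HTTP Event Collector endpoint `/services/collector/event` and the `Authorization: Splunk <token>` header. The host name, token, and `sourcetype` value are placeholders, not values from this thread.

```python
import json
import urllib.error
import urllib.request


def build_hec_request(base_url, token, event, sourcetype="jenkins_test"):
    """Build (but do not send) a POST request for the HEC event endpoint."""
    payload = json.dumps({"event": event, "sourcetype": sourcetype}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=payload,
        headers={
            "Authorization": "Splunk " + token,  # HEC token scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return (status_code, body).

    Error statuses such as 503 Service Unavailable raise HTTPError in
    urllib, so they are caught and returned like any other response.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")


# Placeholder host and token; replace with your own before calling send().
req = build_hec_request("https://splunk.example.com:8088", "YOUR-TOKEN", "hello from jenkins master")
print(req.full_url)
```

Calling `send(req)` from each master shows whether the collector answers 200 or 503 at that moment. If the collector uses a self-signed certificate, `urlopen` will reject it unless you pass a suitable SSL context.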

Jenkins log entry from working master.

Dec 06, 2016 10:44:01 AM com.splunk.splunkjenkins.utils.LogConsumer run
        at com.splunk.splunkjenkins.utils.LogConsumer.run(LogConsumer.java:84)

Jenkins log from master that is not working.

Dec 06, 2016 3:50:06 AM com.splunk.splunkjenkins.utils.LogConsumer run
        at com.splunk.splunkjenkins.utils.LogConsumer$1.handleResponse(LogConsumer.java:63)
        at com.splunk.splunkjenkins.utils.LogConsumer$1.handleResponse(LogConsumer.java:43)
        at com.splunk.splunkjenkins.utils.LogConsumer.run(LogConsumer.java:84)

Looking for something to try without restarting Jenkins (it's a critical production master).


GnanasekarP
New Member

I was able to get SSH access to the system and did the following to solve my problem. The solution is taken from the Removing and Disabling Plugins wiki page.

touch /var/lib/jenkins/plugins/splunk-devops.jpi.disabled
touch /var/lib/jenkins/plugins/splunk-devops-extend.jpi.disabled

Then I rebooted. From the documentation, I assume that this works with any plugin.
More information is here: https://wiki.jenkins-ci.org/display/JENKINS/Removing+and+disabling+plugins


txiao_splunk
Splunk Employee

The error indicates that the HTTP Event Collector is temporarily out of service, which is most often caused by blocked queues. You can find the reason for the blockage with this search:

index=_internal blocked

Or with grep:

grep blocked $SPLUNK_HOME/var/log/splunk/metrics.log

There is a wiki page for troubleshooting blocked queues: https://wiki.splunk.com/Community:TroubleshootingBlockedQueues
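If the grep output is long, a small script can summarize which queues report as blocked most often. This is a rough sketch that assumes metrics.log lines carry comma-separated `key=value` pairs such as `group=queue`, `name=...`, and `blocked=true`. The sample lines are illustrative and are not taken from this thread.

```python
import re
from collections import Counter

# Matches key=value pairs such as "name=parsingqueue" or "blocked=true".
_PAIR = re.compile(r"(\w+)=([^,\s]+)")


def count_blocked_queues(lines):
    """Return a Counter mapping queue name to the number of blocked=true lines."""
    counts = Counter()
    for line in lines:
        fields = dict(_PAIR.findall(line))
        if fields.get("group") == "queue" and fields.get("blocked") == "true":
            counts[fields.get("name", "unknown")] += 1
    return counts


# Illustrative lines only (assumed shape of metrics.log queue entries).
sample = [
    "12-06-2016 03:50:06.123 +0000 INFO  Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=6144, current_size_kb=6143",
    "12-06-2016 03:50:06.123 +0000 INFO  Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499",
    "12-06-2016 03:50:37.124 +0000 INFO  Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=6144, current_size_kb=6143",
    "12-06-2016 03:50:37.124 +0000 INFO  Metrics - group=queue, name=aggqueue, max_size_kb=1024, current_size_kb=0",
]
print(count_blocked_queues(sample).most_common())
```

To run it against a real file, pass the open file object instead of `sample`, for example `count_blocked_queues(open(path))` with `path` pointing at the metrics.log mentioned above. The queue that blocks most often is usually the place to start looking.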


hal_boggess
Explorer

Came in this morning and Jenkins/Splunk logging had stopped again after working OK for about a week on the same master.

java.io.IOException: failed to send data,Service Unavailable

When I test the connection I get:

token:xxxxxxxxxxxxxxxxxxxxxxxx response:Service Unavailable

Still working from other 3 masters.


txiao_splunk
Splunk Employee

There should be a more verbose error message that gives the reason. Can you search the logs for something like
"failed to send data, reason phase"?
If the reason is something like "connect reset", can you try increasing the number of "Retries on Error" in the Advanced section?

If the reason is "service unavailable" and you are using a heavy forwarder to forward data to Splunk servers across a WAN, please adjust maxQueueSize in outputs.conf. See also https://docs.splunk.com/Documentation/Splunk/6.4.2/Admin/Outputsconf.
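To illustrate why more retries can help with a temporary "Service Unavailable", here is a generic retry-with-backoff sketch. It is not the plugin's actual implementation, which I have not checked. `send_once` is a hypothetical callable that returns an HTTP status code, and the delay values are arbitrary.

```python
import time


def send_with_retries(send_once, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send_once() until it returns a non-503 status or retries run out.

    Makes one initial attempt plus up to `retries` further attempts.
    Returns (last_status, number_of_attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        status = send_once()
        if status != 503 or attempts > retries:
            return status, attempts
        # Exponential backoff: base_delay, 2x, 4x, ... between attempts.
        sleep(base_delay * 2 ** (attempts - 1))


if __name__ == "__main__":
    # Simulated collector that is unavailable twice and then recovers.
    responses = iter([503, 503, 200])
    print(send_with_retries(lambda: next(responses), base_delay=0.01))
```

If the collector's queues stay blocked for longer than the total backoff time, retries alone will not help and the queue sizing on the Splunk side still needs attention.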


txiao_splunk
Splunk Employee

@hal.boggess did you have a chance to check the error details?
