Hi everyone,
I've currently deployed the following instances in my Splunk infrastructure, all running Splunk 8.1.0:
- 1 Search Head
- 1 Cluster Master
- 2 Indexers in cluster
- 2 Heavy Forwarders
Everything seems to work fine except for the Cluster Master. Since I added the 2 Indexers to the cluster, the following messages are repeated in splunkd.log on the Cluster Master:
10-28-2020 16:59:54.703 +0100 WARN Fixup - GenCommitFixup::finish error in scheduler sendQueued=
10-28-2020 16:59:54.704 +0100 WARN CMMaster - Unable to send scheduled jobs, err=""
10-28-2020 16:59:55.202 +0100 DEBUG CMMaster - event=serviceHeartbeats size=1
10-28-2020 16:59:55.202 +0100 DEBUG CMMaster - event=setPeerStatus Skipping since peer=AAAAA peer_name=splunk-indexer-2 is already in status=Up reason=heartbeat received.
10-28-2020 16:59:55.202 +0100 DEBUG CMMaster - event=serviceRecreateIndexJobs No indexes to be recreated
10-28-2020 16:59:55.203 +0100 DEBUG CMMaster - event=serviceRecreateBucketJobs No buckets to be recreated
10-28-2020 16:59:55.203 +0100 WARN Fixup - GenCommitFixup::finish error in scheduler sendQueued=
10-28-2020 16:59:55.203 +0100 WARN CMMaster - Unable to send scheduled jobs, err=""
10-28-2020 16:59:55.703 +0100 DEBUG CMMaster - event=serviceHeartbeats size=1
10-28-2020 16:59:55.704 +0100 DEBUG CMMaster - event=setPeerStatus Skipping since peer=BBBBB peer_name=splunk-indexer-1 is already in status=Up reason=heartbeat received.
10-28-2020 16:59:55.704 +0100 DEBUG CMMaster - event=serviceRecreateIndexJobs No indexes to be recreated
10-28-2020 16:59:55.704 +0100 DEBUG CMMaster - event=serviceRecreateBucketJobs No buckets to be recreated
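For what it's worth, the DEBUG lines above show both indexers already in status=Up, and cluster membership can be double-checked from the Cluster Master CLI with something like this (a minimal sketch; admin:changeme is a placeholder credential):
# Run on the Cluster Master: summarises replication/search factor status and lists the peers.
/opt/splunk/bin/splunk show cluster-status -auth admin:changeme
# Or list the individual peers and their status.
/opt/splunk/bin/splunk list cluster-peers -auth admin:changeme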
Can you help me with this issue?
Other useful information that may help:
Cluster Master server.conf
[general]
serverName = splunk-clustermaster-1
pass4SymmKey = pass
[sslConfig]
sslPassword = pass
enableSplunkdSSL = true
serverCert = /opt/splunk/etc/auth/server.pem
sslRootCAPath = /opt/splunk/etc/auth/cacert.pem
[lmpool:auto_generated_pool_download-trial]
description = auto_generated_pool_download-trial
quota = MAX
slaves = *
stack_id = download-trial
[lmpool:auto_generated_pool_forwarder]
description = auto_generated_pool_forwarder
quota = MAX
slaves = *
stack_id = forwarder
[lmpool:auto_generated_pool_free]
description = auto_generated_pool_free
quota = MAX
slaves = *
stack_id = free
[clustering]
cluster_label = cluster
mode = master
pass4SymmKey = pass
replication_factor = 2
[indexer_discovery]
pass4SymmKey = pass
Indexers server.conf
[general]
serverName = splunk-indexer-1
pass4SymmKey = pass
parallelIngestionPipelines = 2
pipelineSetSelectionPolicy = weighted_random
[sslConfig]
enableSplunkdSSL = true
sslPassword = pass
sslRootCAPath = /opt/splunk/etc/auth/cacert.pem
serverCert = /opt/splunk/etc/auth/server.pem
[lmpool:auto_generated_pool_download-trial]
description = auto_generated_pool_download-trial
quota = MAX
slaves = *
stack_id = download-trial
[lmpool:auto_generated_pool_forwarder]
description = auto_generated_pool_forwarder
quota = MAX
slaves = *
stack_id = forwarder
[lmpool:auto_generated_pool_free]
description = auto_generated_pool_free
quota = MAX
slaves = *
stack_id = free
[replication_port-ssl://8080]
serverCert = /opt/splunk/etc/auth/server.pem
sslPassword = password
[clustering]
master_uri = https://clustermaster:8089
mode = slave
pass4SymmKey = pass
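In case it helps, the merged clustering settings that are actually in effect on each node can be dumped with btool, to rule out a config-layering mismatch (a sketch; paths assume a default /opt/splunk install):
# Show the effective [clustering] settings and which .conf file each value comes from.
/opt/splunk/bin/splunk btool server list clustering --debug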
Thank you!
This issue seems to be fixed in Splunk Enterprise 8.1.2.
That's correct. The issue is gone after updating to 8.1.2.
Is this issue resolved? We are facing the same issue: the CM is unable to schedule replication jobs to the indexers after upgrading to 8.1.1. We are seeing a lot of warnings for the GenCommitFixup errors and "Unable to send scheduled jobs" with err="". We have verified the network ports and firewalls; everything is working fine from the network end.
So yeah, Splunk Support says it's a known bug. It will most likely be fixed in the next release. It's a non-impacting error, but if it's annoying then downgrade your Splunk to 8.0. Here's the snippet from my support ticket:
I have been checking this issue with a couple of colleagues, and it seems you are hitting a known internal bug, SPL-197830. We are still working on a solution for it; it's currently high priority, and it seems that the resolution will come out in version 8.1.2.
According to the notes on the internal case, the "errors" are false positives, so there isn't much to worry about. I know you are getting flooded with the errors, and a temporary solution could be to downgrade to a version 8.0.x.
We haven't seen that error on that version yet, but currently that is our best option. If you are planning to wait, I suggest you call the Splunk hotline and ask about the internal case SPL-197830 just to make sure everything was fixed.
Regards
SR
Thanks for sharing, but in my opinion it does have an impact on fixup/recovery progress and speed.
As @berlincount already mentioned:
and it's putting the brakes onto Cluster Data Rebalances massively
We see the same behaviour on our bigger cluster, as the search factor (SF) fixup tasks took significantly longer than they did prior to the update.
We will open a case on our behalf as well, and I will share the conclusion here.
Thanks @sramiz for the update.
@claudio_manig Agree with you on the fixup activities part! We are also still waiting to hear back from Splunk Support.
Hello,
I have opened a case with Splunk, and the support team asked me to upgrade my Splunk version to 8.1.1, but as you noticed too, it didn't work. I am waiting to hear from Splunk on next steps.
regards,
SR
Any news on this case?
Facing the same issue after upgrading from 7.3.x to 8.1.1.
Having the same issue on v8.1.0.
Same here, with a notably bigger cluster - and it's putting the brakes onto Cluster Data Rebalances massively.
I believe the maintenance release 8.1.0.1 might fix this, per the release notes:
https://docs.splunk.com/Documentation/Splunk/8.1.0/ReleaseNotes/Fixedissues
2020-10-27 | SPL-196757, SPL-197069, SPL-197599 | Upgrading Cluster Master to 8.1 with indexer discovery enabled stops CM forwarding its logs |
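Note that the fixed issue above is specific to indexer discovery; whether indexer discovery is actually configured on the Cluster Master can be confirmed with btool (a sketch, again assuming a default /opt/splunk install):
# List any [indexer_discovery] settings in effect on the Cluster Master.
/opt/splunk/bin/splunk btool server list indexer_discovery --debug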
Just installed 8.1.0.1 (test environment 7.3.3 > 8.1.0.1)
It doesn't fix it.
Log is still filling with " WARN Fixup - GenCommitFixup::finish error in scheduler sendQueued= "
Did some more testing. The issue even pops up with a clean install.
I installed 2 machines, just a clean install. As soon as the indexer connects to the cluster master, the warning pops up.
The firewall is disabled.
Seconded; I just upgraded a clustered environment that was experiencing the issue from 8.1.0 to 8.1.0.1, and the errors are still being shown in the messages log.
Seeing this behavior on the CM in my environment as well. Upgraded from 7.3.3 to 8.1.0.
Searches run properly from the SHC and other independent SHs, the CM can search as well, and indexing from UFs and HFs is working properly.
Additionally, the proper ports are opened up.
Make sure all these ports on your nodes are open and accessible.
8000 8065 8088 8089 8191 9887 9997
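If you want to script that connectivity check from another node, something like this works with nc (a sketch; splunk-indexer-1 is a placeholder hostname, and not every port applies to every role):
# Probe each of the common Splunk ports with a 3-second timeout.
for port in 8000 8065 8088 8089 8191 9887 9997; do
  nc -z -v -w 3 splunk-indexer-1 "$port"
done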
I'm seeing this error on my cluster masters since upgrading to 8.1.0
Would love to see a fix!
I am seeing this on 8.1.1.
Are you able to run searches from CM?
I can run searches from both the Search Head and the Cluster Master against the _internal index, for example; the main index is empty for now.
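If you want to gauge how often the warning is actually firing, a quick CLI search over _internal could look like this (a sketch; admin:changeme is a placeholder credential):
# Run on the Cluster Master: count the CMMaster warnings per hour over the last 24 hours.
/opt/splunk/bin/splunk search 'index=_internal sourcetype=splunkd component=CMMaster log_level=WARN "Unable to send scheduled jobs" earliest=-24h | timechart span=1h count' -auth admin:changeme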