We suddenly got a couple of thousand SHCArtifactId errors. Essentially the messages say 'The artifact blah contains the GUID foo. This GUID does not match the member's current GUID.'
What exactly is this telling me and how does a member's GUID get changed?
We're on version 6.5.2 Enterprise.
I have the same messages in my log on the SHC captain.
02-14-2017 07:43:04.655 +0100 ERROR SHCArtifactId - The artifactID "rt_user_c3BsdW5rX21ndF9tb24__appname__RMD56c49e503d8c06396_at_1487052704_53_C8DDA17B-D35A-49E1-BBEE-93655CCE6102" contains the GUID "C8DDA17B-D35A-49E1-BBEE-93655CCE6102". This GUID does not match the member's current GUID, "69204358-A333-41AD-BC4B-B531FB108AFE". This error can occur if you changed the member's GUID. If you did change the GUID, you must remove the member from the cluster, clean it, and add it back into the cluster. See the Distributed Search manual for a procedure to accomplish these steps.
I tried to find this artifactID in the dispatch folder, but it's not there.
The GUID in the artifactID belongs to another SHC member (splunksh2), which is online and fully functional; see the shcluster-status output below. (Yes, there is no splunksh3 at the moment; there never has been in this setup.)
Captain:
 dynamic_captain : 1
 elected_captain : Tue Feb 14 07:06:20 2017
 id : 7D31EBEB-5B88-45B3-B85C-059563FB2002
 initialized_flag : 1
 label : splunksh1
 mgmt_uri : https://splunksh1:8089
 min_peers_joined_flag : 1
 rolling_restart_flag : 0
 service_ready_flag : 1

Members:
 splunksh1
  label : splunksh1
  mgmt_uri : https://splunksh1:8089
  mgmt_uri_alias : https://splunksh1:8089
  status : Up
 splunksh2
  label : splunksh2
  last_conf_replication : Tue Feb 14 07:45:29 2017
  mgmt_uri : https://splunksh2:8089
  mgmt_uri_alias : https://10.10.10.90:8089
  status : Up
 splunksh4
  label : splunksh4
  last_conf_replication : Tue Feb 14 07:45:28 2017
  mgmt_uri : https://splunksh4:8089
  mgmt_uri_alias : https://10.10.10.97:8089
  status : Up
Is the search head one that you tried to add in (and then got this message), or one that was already working in the cluster that this error just came up for no apparent reason?
It's one that was already working in the cluster, and it still works: it gets jobs delegated from the captain and everything.
Also there was no search head added after the initial SHC formation.
This is an issue that will be fixed in Splunk Enterprise 6.5.3. Rest assured, your GUIDs have not been changed; the error message does not reflect the actual root cause. As a workaround, you can set the SHCArtifactId logging channel to log level FATAL to suppress the messages, or alternatively modify your alert to filter out errors from the SHCArtifactId category.
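If you go the alert-filter route and your alert searches the _internal index for splunkd errors, you can exclude the channel by its component field (a sketch; the exact base search depends on how your alert is written):

```
index=_internal sourcetype=splunkd log_level=ERROR NOT component=SHCArtifactId
```

The component field on splunkd events carries the logging channel name, so this drops only the SHCArtifactId messages while leaving all other ERROR events visible to the alert.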
/opt/splunk/bin/splunk set log-level SHCArtifactId -level FATAL
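Note that a log level set via the CLI does not survive a splunkd restart. To make the suppression persistent, the same channel can be set in log-local.cfg (a sketch, assuming a default install under /opt/splunk; a restart is needed for the file change to take effect):

```ini
# /opt/splunk/etc/log-local.cfg
[splunkd]
category.SHCArtifactId=FATAL
```

Remember to remove this override once you have upgraded to 6.5.3, so that genuine artifact-ID errors are not hidden.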