What we did was:

- Restored two old peer nodes from a backup.
- Cloned the master node to set up a shadow cluster and adapted the replication factor on this clone to 2. This allowed us to make a mini-cluster which is fully balanced, so both restored peer nodes would have all data. I did notice, however, that on one of the two recovered nodes the colddb location remained empty.
- Placed the shadow cluster in maintenance mode and removed one of the peer nodes.
- Reconfigured this peer to connect to the production cluster. Also changed the name in server.conf and removed instance.cfg to prevent duplicate peer names and GUIDs.

When I check the "Settings / Indexer Clustering" page on the master, it does show the recovered node, and the "Indexes" tab on the same page shows all indexes as green. But when I search for the earliest data (earliestTime), the older events which sit on the recovered peer are not seen. Only when I add the recovered peer to distsearch.conf does the search find the older events (conf snippet at the end of this post). And when I remove the recovered peer from the cluster again, the older events are gone again, which indicates those cold buckets were never synced to the production nodes.

The buckets have not rolled to frozen, because frozenTimePeriodInSecs for the index is set to 157248000 (about 5 years) and the data I am trying to recover is from 2020. I also just ran a dbinspect, and it reports no errors on the cold buckets on the restored host: the path is the colddb path and the state is 'cold', as expected.

Eventually I would like to remove the recovered peer from the cluster again, since it is still running RHEL7 and has to be switched off. So I am looking for a way to safely get the data onto the RHEL9 nodes.

As a side track, I also want to understand how warm/cold buckets are handled, because if they are indeed not replicated, that would also explain why they were lost in the first place: the RHEL9 nodes were clean installations which replaced the RHEL7 nodes. The rough procedure followed in that migration was:

- Add an additional "overflow" peer to the cluster and make sure the cluster is synced.
- Bring down (offline --enforce-counts) one of the RHEL7 nodes and replace it with a clean RHEL9 node. The config from /opt/splunk/etc was taken over from the old RHEL7 node.
- When all nodes were replaced, the "overflow" node was removed.

So, if cold buckets were not replicated, they were never replicated to the overflow node either and eventually were all gone.
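For reference, the distsearch.conf workaround mentioned above is just the recovered peer added as an extra (non-clustered) search peer on the search head. The hostname below is a placeholder and the port is the default management port:

    # distsearch.conf on the search head
    # (appended to the existing comma-separated servers list, if any)
    [distributedSearch]
    servers = https://recovered-peer.example.com:8089

And the dbinspect check was along these lines (index and server names are placeholders); path and state come back as the colddb path and 'cold':

    | dbinspect index=my_index
    | search splunk_server=recovered-peer state=cold
    | table bucketId, state, path, startEpoch, endEpoch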
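And for completeness, the RHEL7 to RHEL9 peer swap itself used roughly the standard commands, run on the peer being retired and on the master respectively:

    # On the RHEL7 peer being retired: take it offline and let the cluster
    # restore the replication/search counts before the peer shuts down
    splunk offline --enforce-counts

    # On the master: watch peer status and fix-up progress while this runs
    splunk show cluster-status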