Solved: Are rolling buckets moved or copied?

alekksi · ‎05-19-2017

Hi all,

By what means are buckets rolled? If, for example, a bucket is rolled from warm to cold, is that bucket copied or moved, in Linux terms?

We have some potential issues with storage performance during rolling restarts of the indexer cluster and we're looking to minimise this where possible, so it would be very useful to learn what the BucketMover mechanism is actually doing under the covers.

Any help would be appreciated!

Thanks and regards,
Alex

alekksi · ‎06-21-2017

Buckets are copied, verified, then deleted. Having hot and cold on the same partition has no performance benefits.

jhatch_splunk · ‎09-12-2019

An old thread, but thought it was worth a clarification.

If buckets are on the same volume, or partition, then we simply perform a rename via mv or move. In fact we always attempt to rename first, but if a bucket is moving across partitions then we fall back to copydelete - which means we'll perform a recursive copy (via a buffer) before deleting the source bucket once this succeeds.
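
To illustrate the rename-first, copy-then-delete fallback described above, here is a minimal Python sketch. It is not Splunk's code: the function name `move_bucket` and its return values are hypothetical, and the copy step uses `shutil.copytree` rather than whatever buffered copy Splunk performs internally. It only shows the general pattern, where a rename fails with `EXDEV` when source and destination are on different filesystems.

```python
import errno
import os
import shutil


def move_bucket(src: str, dst: str) -> str:
    """Move a bucket directory and report which strategy was used.

    Hypothetical helper for illustration only; not Splunk's implementation.
    """
    try:
        # Same filesystem: a rename only updates directory entries,
        # so no bucket data is read or rewritten.
        os.rename(src, dst)
        return "rename"
    except OSError as exc:
        # EXDEV means src and dst are on different filesystems.
        # Any other error is a real failure and is re-raised.
        if exc.errno != errno.EXDEV:
            raise

    # Different filesystems: recursively copy, then delete the source
    # only after the copy has completed without raising.
    shutil.copytree(src, dst)
    shutil.rmtree(src)
    return "copydelete"
```

The I/O cost differs sharply between the two branches: the rename is a metadata operation, while the fallback reads and writes every byte of the bucket.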

With frozen buckets it depends on whether they're being deleted (the default), or if you are archiving them - if archived then it will depend on what your archival script is doing. The sample we provide with Splunk deletes everything except the rawdata and then performs a copy to the frozen path.

cpetterborg · ‎05-19-2017

HOT and WARM buckets are in the same directory, so it doesn't make sense to copy them. WARM to COLD would depend on your implementation, but most would put COLD on different hardware, which would mean copying is necessary. It doesn't make much sense to have COLD on the same devices as WARM. You would just keep the WARM buckets WARM if they were on the same devices. That being said, an mv across devices is effectively a cp followed by a delete of the source.

alekksi · ‎05-22-2017

Sorry, to clarify further, I am not manually moving or copying buckets as of yet. I wouldn't use either mv or cp to do this, but instead rsync.

What I need to know is which underlying OS mechanism Splunk uses to get the data from one storage location to the other. If it's copy/delete, it will have different performance impacts on the underlying storage than if a move is used.

cpetterborg · ‎05-22-2017

I don't know the exact method that Splunk uses, but, as I said, if it goes across devices (which is generally how warm vs cold is done, since cold is usually on cheaper hardware), it can't be moved, it has to be copied. If you are putting your cold on different devices than the warm and hot, then it will definitely be copied from one device to another. There is no other option. What is your hardware configuration like for the cold vs warm? Is it the same device, and if so, why have the cold data in the first place? It won't buy you any advantage to have it on the same hardware.
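
One way to answer the "same device?" question for a given pair of directories is to compare the device IDs the OS reports for them. This is a small sketch, not a Splunk tool, and the function name `same_device` is hypothetical. It assumes both paths already exist.

```python
import os


def same_device(path_a: str, path_b: str) -> bool:
    """Return True if both paths live on the same filesystem/device.

    Hypothetical helper: compares the st_dev field reported by os.stat.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

Calling it with the directories that hold your warm and cold buckets shows whether moving data between them can be a plain rename (True) or has to be a copy (False).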
