Are rolling buckets moved or copied?

alekksi
Communicator

Hi all,

By what means are buckets rolled? If, for example, a bucket is rolled from warm to cold, is that bucket copied or moved, in Linux terms?

We have some potential issues with storage performance during rolling restarts of the indexer cluster and we're looking to minimise this where possible, so it would be very useful to learn what the BucketMover mechanism is actually doing under the covers.

Any help would be appreciated!

Thanks and regards,
Alex

1 Solution

alekksi
Communicator

Buckets are copied, verified, then deleted. Having hot and cold on the same partition has no performance benefits.

jhatch_splunk
Splunk Employee

An old thread, but thought it was worth a clarification.

If buckets are on the same volume, or partition, then we simply perform a rename via mv or move. In fact we always attempt to rename first, but if a bucket is moving across partitions then we fall back to copydelete - which means we'll perform a recursive copy (via a buffer) before deleting the source bucket once this succeeds.
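The rename-first, copy-and-delete-on-failure pattern described above can be sketched roughly as follows. This is not Splunk's actual code, only an illustration of the general technique on Linux; the function name `move_bucket` and its return values are hypothetical.

```python
import errno
import os
import shutil


def move_bucket(src: str, dst: str) -> str:
    """Move a bucket directory and report which path was taken.

    Hypothetical helper: tries a cheap rename first and falls back to a
    recursive copy followed by a delete when src and dst are on
    different filesystems.
    """
    try:
        # On the same filesystem this is a metadata-only operation.
        os.rename(src, dst)
        return "rename"
    except OSError as exc:
        # EXDEV means "cross-device link": rename cannot span filesystems.
        if exc.errno != errno.EXDEV:
            raise

    # Fallback: copy every file, then remove the source once the copy succeeds.
    shutil.copytree(src, dst)
    shutil.rmtree(src)
    return "copy+delete"
```

The storage impact differs sharply between the two branches: the rename touches only directory metadata, while the fallback reads and rewrites every byte of the bucket.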

With frozen buckets it depends on whether they're being deleted (the default), or if you are archiving them - if archived then it will depend on what your archival script is doing. The sample we provide with Splunk deletes everything except the rawdata and then performs a copy to the frozen path.
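As a rough illustration of the archival behaviour described above (strip everything except rawdata, then copy to the frozen path), a script could look something like this. It is a simplified sketch, not the sample script shipped with Splunk; the function name `archive_bucket` and the argument names are assumptions.

```python
import os
import shutil


def archive_bucket(bucket_dir: str, frozen_dir: str) -> str:
    """Strip a bucket down to its rawdata and copy it to frozen_dir.

    Hypothetical helper. Note that it modifies the source bucket in
    place before copying, mirroring the behaviour described in the post.
    """
    for entry in os.listdir(bucket_dir):
        if entry == "rawdata":
            continue  # keep the raw journal, drop everything else
        path = os.path.join(bucket_dir, entry)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)

    # Copy what is left to a directory of the same name under frozen_dir.
    bucket_name = os.path.basename(os.path.normpath(bucket_dir))
    dest = os.path.join(frozen_dir, bucket_name)
    shutil.copytree(bucket_dir, dest)
    return dest
```

Because the final step is a copy, the I/O cost here depends on the size of the rawdata rather than on whether the frozen path shares a partition with the cold path.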

cpetterborg
SplunkTrust

HOT and WARM buckets are in the same directory, so it doesn't make sense to copy them. WARM to COLD would depend on your implementation, but most would put COLD on different hardware, which would mean copying is necessary. It doesn't make much sense to have COLD on the same devices as WARM. You would just keep the WARM buckets WARM if they were on the same devices. That being said, an mv across devices is effectively a cp followed by deleting the source.

alekksi
Communicator

Sorry, to clarify further, I am not manually moving or copying buckets as of yet. I wouldn't use either mv or cp to do this, but instead rsync.

What I need to know is what the underlying OS mechanism used by Splunk is, in order for the data to go from one storage location to the other? If it's copy/delete, it will have different performance impacts on the underlying storage than if a move is used.

cpetterborg
SplunkTrust

I don't know the exact method that Splunk uses, but, as I said, if it goes across devices (which is generally how warm vs cold is done, since cold is usually on cheaper hardware), it can't be moved, it has to be copied. If you are putting your cold on different devices than the warm and hot, then it will definitely be copied from one device to another. There is no other option. What is your hardware configuration like for the cold vs warm? Is it the same device, and if so, why have the cold data in the first place? It won't buy you any advantage to have it on the same hardware.
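To check whether two storage paths sit on the same device, one rough approach is to compare the device IDs reported by `stat`. The helper name `same_filesystem` is hypothetical, and the check is only a heuristic: bind mounts of one filesystem share a device ID, yet a rename between them can still fail.

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

Called with the warm and cold paths of an index, a `False` result means a move between them cannot be a simple rename and will involve copying the data.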
