Deployment Architecture

Attempt to thaw data slow and ended unexpectedly

michael_sleep
Communicator

Hello, we have a clustered index that basically has one master indexer and two child indexers. Our data moves to a frozen state after 3 months and we needed to run a report on 6 months worth of data. I moved buckets from the necessary index from the "frozendb" folder to the "thaweddb" folder and created a script to issue the following command to each bucket (bucket names being whatever the bucket was obviously):

splunk rebuild R:\splunkdb\sp_72_logs\thaweddb\rb_1469724228_1469715286_114_CACEB811-4B3C-4B60-AE46-A061185F4F10

This process took over 2 days and was still running when my Powershell session was abruptly ended. I was curious what people would recommend doing in this case or if anyone has ever noted the thawing process to be extremely slow. There doesn't appear to be a way for me to know which buckets are and aren't thawed or where the data actually ends to try and sort it out. Should I just run the process again and hope for the best? Will this duplicate the data? Why are rebuilds so slow?

0 Karma
1 Solution

michael_sleep
Communicator

The solution we used for this was actually to break it up into several Powershell scripts (5-10) and run them all concurrently. This didn't seem to impact performance on the indexer at all and each script ran at the same speed. So I guess if you need to thaw a few thousand buckets, do it in several concurrent scripts, unless you want to leave something running for several days.

View solution in original post

michael_sleep
Communicator

The solution we used for this was actually to break it up into several Powershell scripts (5-10) and run them all concurrently. This didn't seem to impact performance on the indexer at all and each script ran at the same speed. So I guess if you need to thaw a few thousand buckets, do it in several concurrent scripts, unless you want to leave something running for several days.

Get Updates on the Splunk Community!

Splunk Observability for AI

Don’t miss out on an exciting Tech Talk on Splunk Observability for AI! Discover how Splunk’s agentic AI ...

[Puzzles] Solve, Learn, Repeat: Dereferencing XML to Fixed-length events

This challenge was first posted on Slack #puzzles channelFor a previous puzzle, I needed a set of fixed-length ...

Stay Connected: Your Guide to December Tech Talks, Office Hours, and Webinars!

What are Community Office Hours? Community Office Hours is an interactive 60-minute Zoom series where ...