Deployment Architecture

Splunk indexes to external storage.

kn450
Explorer

Dear Splunk Community,

I’m currently facing an urgent issue in my Splunk environment: my storage utilization has reached 95%, which threatens system continuity and performance. I plan to move older data to external storage before it’s too late, but I haven’t yet implemented a bucket‐policy to automate time-based data retention.

I would greatly appreciate your expertise on:

  • Best practices for safely and efficiently migrating old data from my current Splunk indexes to external storage.

  • Recommended scripts or Splunkbase apps that facilitate this process.

  • How to ensure continued access to the migrated data when needed, without impacting search performance.

  • Any additional suggestions, practical examples, or links to detailed documentation.

Thank you in advance for your time and assistance.

Kind regards,


PickleRick
SplunkTrust

First things first.

1. You posted this in the Splunk SOAR section of Answers but the question seems to be about Splunk Enterprise. I'll move this thread to the appropriate section but please try to be careful about where you post - the sections are there so that we keep the forums tidy and make it easier to find answers to your problems.

2. We have no idea if you have a standalone installation or clustered indexers. A cluster involves way more work to do what you're asking about and needs special care not to break it.

3. While moving indexes around is possible after their initial creation, it's a risky operation if not done properly, so I'd advise an inexperienced admin against attempting it.

You have been warned.

One more very important thing - what do you mean by "external storage"? If you plan on moving some of your indexes onto some CIFS or NFS (depending on the system your Splunk runs on) share, forget it. This type of storage can be used for storing frozen buckets but not for searchable data.
And now the second warning. You have been warned twice.

Since Splunk's indexes are "just" directories on a disk, there are two approaches to the task of moving the data around.

Option one - stop your Splunk, move the hot/warm and/or cold directories to another directory, adjust the index definition in indexes.conf accordingly, then start Splunk.

Option two - stop your Splunk, move the hot/warm and/or cold directories to another directory and make the OS see the new location under the old location (using bind mount, symlink, junction - depending on the underlying OS), start Splunk.
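To make option two a bit more concrete, here is a minimal sketch for a Unix-like system using a symlink. It assumes Splunk is already stopped and that the destination is local block storage rather than an NFS/CIFS share. The function name and the paths in the usage note are hypothetical, and the script does not check that splunkd is down.

```python
import os
import shutil
from pathlib import Path


def relocate_with_symlink(old_dir: str, new_dir: str) -> None:
    """Move old_dir to new_dir and leave a symlink at the old location.

    Assumes splunkd is stopped; this function does not verify that.
    """
    old = Path(old_dir).absolute()
    new = Path(new_dir).absolute()
    if old.is_symlink():
        raise RuntimeError(f"{old} is already a symlink; refusing to move it again")
    if not old.is_dir():
        raise FileNotFoundError(f"{old} is not a directory")
    if new.exists():
        raise FileExistsError(f"{new} already exists; refusing to overwrite it")

    new.parent.mkdir(parents=True, exist_ok=True)
    # shutil.move falls back to copy-then-delete when crossing filesystems.
    shutil.move(str(old), str(new))
    # The old path now resolves to the new location, so indexes.conf stays unchanged.
    os.symlink(new, old, target_is_directory=True)
```

A call might look like `relocate_with_symlink("/opt/splunk/var/lib/splunk/my_index/colddb", "/data2/splunk/my_index/colddb")`. If the move crosses filesystems, check afterwards that the files are still owned by the user running Splunk before starting it again.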

In case of clustered indexers you must go with the second option, at least until you've moved data in all indexers, because all indexers must share the same config so you can't reconfigure just some of the indexers.

Option two is the only way to go if you wanted to move just some part of your buckets (like oldest half of cold buckets) but this is something even I wouldn't try to do in production. You have been warned thrice!

Having said that - it would probably be way easier to attach an external storage as frozen storage and make Splunk rotate the older buckets there. Of course frozen buckets are not searchable so from Splunk's point of view they are effectively deleted. (And handling frozen buckets in a cluster can be tricky if you want to avoid duplicated frozen buckets).
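For the frozen-storage route, the indexes.conf settings involved are frozenTimePeriodInSecs (a bucket is frozen once its newest event is older than this) and coldToFrozenDir (where Splunk archives the bucket instead of deleting it). The small sketch below only converts a retention period in days to seconds and prints the lines to merge into an existing index stanza. The index name, the 90-day retention and the archive path are placeholder assumptions, not recommendations.

```python
SECONDS_PER_DAY = 86_400


def render_retention_stanza(index_name: str, retention_days: int, frozen_dir: str) -> str:
    """Return indexes.conf lines that archive buckets older than retention_days."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    frozen_secs = retention_days * SECONDS_PER_DAY
    return (
        f"[{index_name}]\n"
        f"frozenTimePeriodInSecs = {frozen_secs}\n"
        f"coldToFrozenDir = {frozen_dir}\n"
    )


# Hypothetical index and mount point, for illustration only.
print(render_retention_stanza("my_index", 90, "/mnt/archive/my_index/frozen"))
```

With these inputs the frozenTimePeriodInSecs line comes out as 7776000. Shortening retention makes Splunk freeze everything older than the new limit, so the archive location should be mounted and writable before the change is applied.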

Another option (but one that completely messes with your overall architecture) would be to use remote S3-compatible storage and define SmartStore-backed indexes, but it is a huge overhaul of your whole setup, and while in some cases it can help, in others it can cause additional problems, so YMMV.


kn450
Explorer

Thank you, it worked.


livehybrid
SplunkTrust

Hi @kn450 

To address high storage utilization by moving older Splunk data, the recommended approach involves configuring data retirement policies. Manually moving buckets is generally discouraged due to complexity and risk.

  1. Implement Data Retention Policies: Configure your indexes.conf file to automatically manage the data lifecycle (hot -> warm -> cold -> frozen). Set frozenTimePeriodInSecs to define when data should be considered frozen. By default Splunk deletes frozen data, but you can configure a script (coldToFrozenScript) to move it to external storage instead, or set coldToFrozenDir to a frozen path on additional storage. However, data archived this way must be restored (thawed) before Splunk can search it again.
  2. Immediate Action (Use with Caution): If space is critical now and retention policies aren't configured:
    • Identify the oldest cold buckets ($SPLUNK_DB/<index_name>/colddb/*).
    • Back up these buckets first.
    • Manually move the oldest cold buckets to external storage. This frees up space but makes the data unsearchable by Splunk unless restored.
    • Alternatively, if data loss is acceptable for the oldest data, adjust frozenTimePeriodInSecs to a shorter duration and restart Splunk; it will begin freezing (and potentially deleting, depending on configuration) older data. This is irreversible if deletion is enabled.
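  3. Accessing Migrated Data: As noted in step 1, frozen or manually moved buckets are not searchable until they have been restored (thawed) back into the index.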

Splunk manages data through buckets representing time chunks. These buckets transition from hot (actively written), to warm (read-only), to cold (read-only, potentially moved). The final state is frozen, where Splunk expects the data to be archived or deleted based on indexes.conf settings. Manually moving buckets breaks this native searchability. For more info check out https://docs.splunk.com/Documentation/Splunk/9.4.1/Indexer/Automatearchiving
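As an illustration of the coldToFrozenScript route mentioned in step 1, here is a minimal sketch of such a script. It relies on Splunk passing the bucket directory as the script's argument and treating a zero exit code as success, after which it removes the bucket from the index. The archive root, the file name archive_bucket.py and the choice to copy the whole bucket directory are assumptions made for this example.

```python
import shutil
import sys
from pathlib import Path
from typing import Sequence

# Hypothetical archive location; must be writable by the user running splunkd.
ARCHIVE_ROOT = Path("/mnt/archive/frozen")


def archive_bucket(bucket_dir: str, archive_root: Path = ARCHIVE_ROOT) -> Path:
    """Copy one bucket to archive_root/<index>/<bucket> and return the copy's path."""
    bucket = Path(bucket_dir.rstrip("/\\"))
    if not bucket.is_dir():
        raise FileNotFoundError(f"bucket directory not found: {bucket}")
    # Assumes the usual <index>/colddb/<bucket> layout, so the index is two levels up.
    index_name = bucket.parent.parent.name
    dest = archive_root / index_name / bucket.name
    if dest.exists():
        # Refuse to overwrite an earlier copy of the same bucket.
        raise FileExistsError(f"{dest} is already archived")
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(bucket, dest)
    return dest


def main(argv: Sequence[str]) -> int:
    """Return 0 on success so Splunk may remove the bucket, 1 otherwise."""
    if len(argv) != 2:
        print("usage: archive_bucket.py <bucket_dir>", file=sys.stderr)
        return 1
    try:
        archive_bucket(argv[1])
    except OSError as exc:
        # A non-zero exit signals failure, so Splunk keeps the bucket.
        print(f"archive failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

Saved as a standalone script, the file would end with `sys.exit(main(sys.argv))` and be referenced from the index stanza along the lines of `coldToFrozenScript = "$SPLUNK_HOME/bin/python" "$SPLUNK_HOME/bin/archive_bucket.py"`. Test it on a throwaway index first, because a script that returns 0 without actually archiving anything will lose data.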

Top Tips

  • Backup: Always back up data before manually moving or deleting buckets.
  • Configuration: Properly configuring indexes.conf (especially homePath, coldPath, thawedPath, maxTotalDataSizeMB, frozenTimePeriodInSecs) is crucial for managing storage automatically.
  • Manual Migration Risk: Manually moving buckets is error-prone and complex to manage, especially for searching. It should be a last resort or temporary measure.
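For the "identify the oldest cold buckets" step, a read-only sketch like the one below can help. It relies on the bucket directory naming convention db_<newest_time>_<oldest_time>_<id> (with rb_ for replicated copies and an extra GUID suffix on clustered indexers) and sorts by the newest event time, since that is the value Splunk compares against frozenTimePeriodInSecs. The function name and the default limit are made up for this example, and it only lists buckets, it does not move anything.

```python
import re
from pathlib import Path
from typing import List, Tuple

# db_<newest_epoch>_<oldest_epoch>_<local_id>, optionally followed by _<guid> in a cluster.
BUCKET_RE = re.compile(r"^(?:db|rb)_(\d+)_(\d+)_(\d+)(?:_[0-9A-Fa-f-]+)?$")


def oldest_cold_buckets(colddb: str, limit: int = 10) -> List[Tuple[int, Path]]:
    """Return up to `limit` (newest_event_epoch, path) pairs, oldest buckets first."""
    found = []
    for entry in Path(colddb).iterdir():
        match = BUCKET_RE.match(entry.name)
        if entry.is_dir() and match:
            found.append((int(match.group(1)), entry))
    # Sort on the epoch only, so paths are never compared with each other.
    found.sort(key=lambda item: item[0])
    return found[:limit]
```

Calling `oldest_cold_buckets("/opt/splunk/var/lib/splunk/my_index/colddb")` (a hypothetical path) returns the candidates in the order Splunk itself would freeze them by age.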


kn450
Explorer

Thank you, it worked.
