Hello,
I'm running my Splunk cluster in the cloud, and I'm running out of disk space. I'm planning to increase the available disk space, but I'm wondering if there might be any side effects of doing this that I should prepare for.
Since this would be done in a production environment, I need to avoid losing access to the indexed data at all costs.
I'll also perform a disk snapshot just in case.
All the indexes are set to:
maxDataSize = auto_high_volume
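For reference, the index stanzas look roughly like the example below (the index name, paths, and size value are just illustrative, not our real config). I assume I may also need to raise maxTotalDataSizeMB, or maxVolumeDataSizeMB if volumes are used, so Splunk actually takes advantage of the extra disk:
[my_index]
homePath   = $SPLUNK_DB/my_index/db
coldPath   = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb
# auto_high_volume rolls hot buckets at roughly 10 GB on 64-bit systems
maxDataSize = auto_high_volume
# raise this too if the index should actually use the extra disk
maxTotalDataSizeMB = 500000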
The steps involved would be:
1. Stop the Splunk Forwarder.
2. Stop the Splunk Indexer.
3. Perform Splunk Indexer disk Snapshot.
4. Increase the disk space on Splunk Indexer.
5. Wait for the change to take effect.
6. Restart the Splunk Indexer.
7. Restart the consumers on the Splunk Forwarder.
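On the indexer host I expect the sequence to look roughly like this, assuming an AWS-style setup (our disks show up as /dev/xvd*); the volume ID and new size below are placeholders:
# stop Splunk cleanly before the snapshot and resize
$SPLUNK_HOME/bin/splunk stop
# snapshot the data volume (placeholder volume ID)
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "pre-resize"
# grow the volume (placeholder size in GiB)
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 1000
# once the OS sees the new size, grow the partition and filesystem, then restart
$SPLUNK_HOME/bin/splunk start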
Are there any other steps that I should perform?
Thanks in advance!
If you are already clustered, and your cluster is in good health (all factors met), there isn't necessarily any reason to take the snapshot. If something went wrong on one indexer, I would just rebuild it from scratch and let the CM rebuild the buckets. However, if your testing is flawed and you finish your work on all 3 indexers and then find that something is toast, it would be good to be able to restore from a snapshot. We just got done doing this in Azure on RedHat and are using volume groups. Make sure that you run df and that it reports the correct size; in our case, although the volume was increased and in use, an extra command was needed to make some disk tools aware of the space. Also, you did not say which volume you are growing (hot/cold/archive), and that may make a difference. Finally:
There is no reason for your step #1 (why stop the forwarders?); definitely DO NOT do this.
Replace existing step #1 with: put your Cluster Master into Maintenance Mode.
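For anyone on LVM, the "extra command" part looked roughly like this for us; the device, volume group, and logical volume names below are made-up examples, and on ext4 you would use resize2fs instead of xfs_growfs:
# placeholder LVM layout: /dev/sdc -> splunk_vg -> splunk_lv mounted on /opt/splunk
pvresize /dev/sdc                                  # tell LVM the physical volume grew
lvextend -l +100%FREE /dev/splunk_vg/splunk_lv     # extend the logical volume into the new space
xfs_growfs /opt/splunk                             # grow the filesystem (resize2fs for ext4)
df -h /opt/splunk                                  # confirm the OS now reports the correct size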
Hello woodcock,
Thanks for your input!
Even though we planned to have a clustered architecture, we are only running 1 indexer; that is why I mentioned that I would stop the Splunk forwarder and the consumers, to guarantee that there would be no data loss during the disk snapshot.
That being said, would you still advise putting the cluster into maintenance mode and then stopping the indexer?
We did it in our testing environment and, like you said, we had to perform these commands:
growpart /dev/xvdg 2    # extend partition 2 to fill the enlarged /dev/xvdg disk
resize2fs /dev/xvdg2    # grow the ext filesystem to the new partition size
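For completeness, before restarting the indexer we check that the new size is actually visible (the mount point below is just an example for our layout):
df -h /opt/splunk/var/lib/splunk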
Thank you
If you only have one indexer then you probably do not have a Cluster Master, so forget about Maintenance Mode (that is a CM-only thing). There is never any reason to stop a forwarder; it will queue just fine if the indexer disappears. The only thing stopping it will do is stop it from complaining about no indexers in its _internal log.
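If you want extra headroom while the indexer is down, the forwarder-side queueing lives in outputs.conf; the group name and server below are placeholders, and none of this is required, it is just a sketch of where to tune it:
[tcpout]
defaultGroup = primary_indexers
useACK = true            # forwarder re-sends anything the indexer never acknowledged
[tcpout:primary_indexers]
server = indexer.example.com:9997
maxQueueSize = 512MB     # in-memory queue that holds events while the indexer is unreachable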