I am currently running a 3-indexer cluster in AWS that provides 3 months of hot/warm and 3 months of cold retention for 120 GB a day.
With that setup I found I could technically extend my cold storage to 9 months with a modest increase in disk.
My concern with doing this is that I've always heard you should expand Splunk horizontally rather than vertically.
Given that the users who would be searching over a year of data would be doing so once a month,
do you think that would put too much pressure on the cluster?
The most likely impact is that searches will be slower than before, but Splunk will continue to work. The question is whether the searches will run fast enough to keep users from complaining.
It's almost impossible to say more without additional information. For example, how busy are the indexers now during these reporting periods? Is CPU or disk I/O usage very high when reports are run? If the searches are infrequent and the indexers have spare capacity, a major impact is unlikely. Another factor is whether users run the searches interactively or as saved searches. Interactive users are much more likely to complain about slow performance or queueing, while users who receive reports from saved searches or by email are unlikely to be affected by slowness.
Cold storage is meant for data that will be accessed very infrequently. Splunk can optionally shrink the index (tsidx) files of older buckets to save disk space, and where that reduction is enabled, searching the reduced buckets uses more CPU and disk I/O than searching hot/warm buckets. In addition, cold storage can be located on separate disks, so it is typically moved to cheaper, slower disk space. So you need to decide whether you want to pay for additional or faster disks and less compute power, or for additional compute power and fewer or cheaper disks.
Personally, I would try to get more out of each server rather than immediately adding servers, since additional disk space or CPU resources can be added at a much lower cost than deploying, managing and paying for a whole new server.
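To put a rough number on the "additional disk" side of that trade-off, here is a minimal sizing sketch in Python. It uses the figures from the question (120 GB/day, 3 indexers, cold retention going from 3 to 9 months). Everything else is an assumption: the 0.5 on-disk ratio is only a commonly quoted rule of thumb and varies a lot with the data, a month is treated as 30 days, and the number of replicated copies is set to 1 because the thread does not state the cluster's replication factor.

```python
def cold_storage_gb(daily_ingest_gb, cold_days, on_disk_ratio=0.5, replication_copies=1):
    """Estimate cluster-wide cold storage in GB.

    on_disk_ratio and replication_copies are assumptions, not values
    taken from the thread; adjust them to match your own environment.
    """
    return daily_ingest_gb * cold_days * on_disk_ratio * replication_copies


DAILY_GB = 120   # from the question
INDEXERS = 3     # from the question

current = cold_storage_gb(DAILY_GB, cold_days=90)    # about 3 months of cold
proposed = cold_storage_gb(DAILY_GB, cold_days=270)  # about 9 months of cold
extra_per_indexer = (proposed - current) / INDEXERS

print(f"current cold estimate:  {current:.0f} GB")
print(f"proposed cold estimate: {proposed:.0f} GB")
print(f"extra per indexer:      {extra_per_indexer:.0f} GB")
```

Measuring the real on-disk size of a few existing cold buckets against their raw volume would give a better ratio than the 0.5 placeholder.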
I think this does answer my question. The long-term searches wouldn't be interactive and would run in the middle of the night once a month, so I don't think the searches themselves would hose the index cluster.
My main concern was what would happen when Splunk rolls the data from hot/warm to cold since there would now be a much larger disk to search for open space. Based on your answer it doesn't sound like that would necessarily be the main factor to worry about.
I think I will try increasing the disk space and then increase the retention a month at a time to see if it becomes noticeable during the day.
Thanks.
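The month-at-a-time plan could be sketched as a schedule of values for `frozenTimePeriodInSecs`, the `indexes.conf` setting that controls how old data may get before it is frozen. This is only an illustration: it assumes 30-day months, that total retention today is 6 months (3 hot/warm plus 3 cold), and that the target is 12 months. Size-based limits such as `maxTotalDataSizeMB` can still freeze data earlier if they are reached first, so they would need to be raised in step.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30  # assumption: a month is approximated as 30 days


def frozen_time_period_secs(months):
    """Convert a retention period in months to seconds."""
    return months * DAYS_PER_MONTH * SECONDS_PER_DAY


def retention_schedule(current_months, target_months):
    """List (months, seconds) pairs, stepping up one month at a time."""
    return [
        (m, frozen_time_period_secs(m))
        for m in range(current_months + 1, target_months + 1)
    ]


# Assumed starting point and target: 6 months today, 12 months eventually.
for months, secs in retention_schedule(6, 12):
    print(f"{months} months: frozenTimePeriodInSecs = {secs}")
```

Applying one step, watching daytime indexer load for a while, and only then moving to the next value keeps each change easy to roll back.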