We are currently trying to recover some lost data on one indexer. The manual recovery process is long enough that we are evaluating automated or scripted alternatives.
This data is available on another indexer running on a different instance. The two indexers are not clustered; they are entirely separate.
Since the indexers are completely different instances, the buckets are not identical, so we can't use rsync or a manual copy of the buckets to recover this data.
How would you sync different indexers and copy/collect only the data that is missing on the destination?
What do you mean by "what is missing"? Since they are not clustered, the buckets should be unique content-wise. So your best bet would be to copy over the buckets from the source indexer but you have to rename them to make sure you don't cause collisions.
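To illustrate the renaming step, here is a minimal sketch in Python. It assumes warm/cold buckets on a standalone indexer follow the db_<newest_time>_<oldest_time>_<local_id> directory naming, and it only plans the new names; it does not touch the filesystem. The function name plan_bucket_renames is made up for this example. It only makes sense if the bucket contents really don't overlap, and Splunk should be stopped on the destination before any bucket is actually copied in.

```python
import re

# Assumed naming for non-clustered warm/cold buckets:
# db_<newest_time>_<oldest_time>_<local_id>
BUCKET_RE = re.compile(r"^db_(\d+)_(\d+)_(\d+)$")


def plan_bucket_renames(source_names, dest_names):
    """Map each source bucket name to a name whose local ID is unused on the destination.

    Hypothetical helper: it returns a rename plan and performs no file operations.
    """
    # Collect the local IDs already taken in the destination index.
    used_ids = set()
    for name in dest_names:
        match = BUCKET_RE.match(name)
        if match:
            used_ids.add(int(match.group(3)))

    # Start numbering after the highest ID in use on the destination.
    next_id = max(used_ids, default=-1) + 1

    plan = {}
    for name in sorted(source_names):
        match = BUCKET_RE.match(name)
        if not match:
            continue  # skip hot buckets and anything that is not a db_ bucket
        newest, oldest = match.group(1), match.group(2)
        plan[name] = f"db_{newest}_{oldest}_{next_id}"
        next_id += 1
    return plan
```

You would run this once per index, feeding it the directory listings of the source and destination db paths, and review the plan before copying anything.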
Hi, thanks for the reply.
We have two indexers that should hold the same data for some indexes, because we forward this data to both destinations.
One of them (let's call it A) had a problem and we noticed it lost some data. We also noticed that the other indexer (let's call it B) still has that data. So B now holds data that should also be on A, and we want to fill the gaps.
We are trying to figure out how we should approach copying the data from the indexer that has it to the one that lost it.
It's not so easy. Data in Splunk is stored in buckets. With some care you can move whole buckets but that's it. The issue here is that even if you had the same data sent to two different non-clustered indexers, the buckets receiving this data will be different. And you can't "copy-paste" data from/into a bucket.
Your only chance to transfer a subset of your indexed data would be to export it from one indexer (probably using some clever scripting and searching for _raw events) and ingest it into the other one.
Two huge caveats:
1. Reingesting the data will cost you license usage.
2. There is no way (other than exporting the already indexed data from the other indexer and trying to do it manually) to deduplicate the events.
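The manual deduplication mentioned in the second caveat could look roughly like this in Python. This is only a sketch under assumptions: both indexers' events for the affected time range have already been exported and loaded as dictionaries with _time, host, source, sourcetype and _raw keys, and two events count as "the same" when all of those fields match. The helper names event_key and missing_events are made up for this example.

```python
import hashlib
from collections import Counter

# Fields assumed to identify an event; adjust to whatever your export contains.
KEY_FIELDS = ("_time", "host", "source", "sourcetype", "_raw")


def event_key(event):
    """Fingerprint an event from its identifying fields."""
    parts = (str(event.get(field, "")) for field in KEY_FIELDS)
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def missing_events(source_events, dest_events):
    """Return the source events that have no counterpart on the destination.

    Counts are respected: if the source holds an identical event twice and
    the destination holds it once, one copy is reported as missing.
    """
    remaining = Counter(event_key(e) for e in dest_events)
    missing = []
    for event in source_events:
        key = event_key(event)
        if remaining[key] > 0:
            remaining[key] -= 1  # matched against one destination copy
        else:
            missing.append(event)
    return missing
```

Only the events returned by missing_events would then be re-ingested, which also keeps the extra license usage down to the gap itself. For large time ranges it is probably safer to work in small windows so both exports fit in memory.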
That is exactly what you need to do.
Is there any real reason (apart from the hardware, virtual, or storage capacity it requires) why you don't have an indexer cluster in place? In the long run, an indexer cluster helps you avoid this kind of situation.
Or have you already found the root cause of the lost events?