Hi, I currently have a client with a 2-hour gap in their data ingestion caused by patching of their core components, which likely stopped ingestion altogether during the patching period.
They have a heavy forwarder (which was down) that receives the inputs from the forwarders, 9 of them in this case. I'm wondering what the best course of action is to help them fill the gap and minimize the damage done.
The solution depends on the type of data source and whether the data is still available.
Here are the main scenarios:
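To minimize the risk of data loss during future downtime, you can also:
Enable persistent queues on the UFs and the HF where the input type supports them (a disk-based buffer instead of only an in-memory one). As far as I know this applies to network and scripted inputs, not to file monitor or splunktcp inputs.
Set useACK = true in outputs.conf so that a forwarder keeps data until the receiver acknowledges it.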
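As a rough sketch of those two settings, assuming a plain TCP input on port 514 and a receiver at hf1.example.com:9997 (both are made-up placeholders, and the sizes are arbitrary; tune them to the volume you expect to buffer):

```
# inputs.conf (on the instance that owns the network input)
# Hypothetical TCP input; port and sizes are placeholders
[tcp://514]
queueSize = 1MB
persistentQueueSize = 5GB

# outputs.conf (on the forwarder sending to the HF)
# Hypothetical output group and host name
[tcpout:hf_group]
server = hf1.example.com:9997
useACK = true
```

Note that useACK can produce some duplicate events if an acknowledgment is lost and the forwarder resends, so it trades a little duplication for fewer dropped events.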
So first verify what type of input you have and whether the data is still physically available, then proceed accordingly.
Yup. This pretty much covers it. Just be aware that if the inputs already processed some of the data, then even where they can re-fetch or re-read it, it might be very difficult to get just the missing events without reingesting a whole lot of other data. Resetting a checkpoint will usually do just that: restart from scratch.
Hi @Emersion
I think the best thing to do here would be to have a second HF and have the upstream forwarders balance the output between the two HFs (single output group with 2 servers) - that way they can patch one at a time and ensure that data is still received in a timely manner.
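A minimal outputs.conf sketch of that setup for the upstream forwarders, assuming two HFs reachable as hf1.example.com and hf2.example.com on port 9997 (host names, port, and group name are placeholders, not taken from your environment):

```
# outputs.conf on each upstream forwarder
[tcpout]
defaultGroup = hf_group

# Single output group with two servers; the forwarder
# load-balances between them automatically
[tcpout:hf_group]
server = hf1.example.com:9997, hf2.example.com:9997
useACK = true
```

With both servers in one group, a forwarder should switch to the remaining HF when the other is down for patching, rather than queueing until it comes back.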