Hi,
thanks to the wonderful website_monitoring app, I see some interesting but unexplained tidbits.
We have two indexers with HEC configured. Because of project delays, those HEC inputs are idle.
I use
https://splunk-index1:8088/services/collector/health
for the query in website_monitoring.
At least once a day I get a 5-second response time on one of the indexers, but not on the other. Usually the response time is less than 20 ms.
Checking _index/_audit for anything happening in parallel, I found nothing so far that would explain this monster increase.
It is not linked to specific times.
If I query only the port (without the path), the peak times are just up to 60 ms in the worst case. But that gives me an ugly 404 error, so I figured I might as well use a decent endpoint.
Any ideas?
thx
afx
Not a direct answer to your question, however:
It's best practice NOT to run HEC on indexers.
Ideally you would install heavy forwarders and run the HEC collection endpoints from there.
While this does not directly answer your question, it would mitigate the impact of a slow-responding indexer (if indeed that is the problem) by separating the real-time collection (HEC) response times from the ingestion lag (indexers).
Currently our infrastructure is small, so I am trying not to involve yet another box.
The funny thing is, the machine is pretty much idle when this happens.
cheers
afx
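One way to narrow this down might be to time the health endpoint independently of website_monitoring, so the spikes can be lined up against timestamps on the indexer. Below is a minimal Python sketch of that idea. It is not part of Splunk or the app: the function names (probe, poll, slow_samples), the 1-second threshold, and the assumption that the HEC port uses a self-signed certificate (so verification is switched off) are all illustrative choices.

```python
import ssl
import time
import urllib.error
import urllib.request
from datetime import datetime


def _http_get(url, timeout):
    """Do one GET and return the HTTP status code."""
    # Assumption: self-signed certificate on the HEC port, so skip verification.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (such as the 404 on the bare port) still count as a response.
        return err.code


def probe(url, timeout=10.0, fetch=_http_get):
    """Return (status, elapsed_seconds) for a single request."""
    start = time.perf_counter()
    status = fetch(url, timeout)
    return status, time.perf_counter() - start


def poll(url, count, interval=60.0, fetch=_http_get, sleep=time.sleep):
    """Probe the URL `count` times and return (timestamp, status, elapsed) tuples."""
    samples = []
    for i in range(count):
        status, elapsed = probe(url, fetch=fetch)
        samples.append((datetime.now().isoformat(), status, elapsed))
        if i < count - 1:
            sleep(interval)
    return samples


def slow_samples(samples, threshold=1.0):
    """Keep only the samples whose response time exceeded `threshold` seconds."""
    return [s for s in samples if s[2] > threshold]
```

Something like slow_samples(poll("https://splunk-index1:8088/services/collector/health", count=1440)) would cover a day at one probe per minute. Running it from a second machine as well would show whether the delay follows the indexer or the monitoring host.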