It sounds like the UF might be hitting a resource bottleneck (CPU, memory, disk I/O, or handles) or the Windows Event Log channels may be overwhelmed. If the UF is forwarding to an indexer, intermittent network issues could also create backpressure and stall inputs.
I recommend checking $SPLUNK_HOME/var/log/splunk/splunkd.log for any warnings/errors around the time the data stops; this usually gives good clues as to whether it's resource, input, or connectivity related.
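If the UF is forwarding its own internal logs, you can also review them from your search head with something along these lines (the host value is just a placeholder for your forwarder's hostname):

index=_internal host=<uf_hostname> source=*splunkd.log* (log_level=ERROR OR log_level=WARN)
| stats count by component, log_level

Spikes in a particular component around the stall windows usually tell you whether the problem is input-side or output-side.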
Hi @Priya70, without seeing the actual splunkd.log entries during the stall periods it's hard to say for certain. However, based on your symptoms, the most likely cause is backpressure.
Why backpressure fits your pattern:
- High-volume classic logs (Application/Security/System) pause first
- Lower-volume custom channels (Cisco VPN) continue uninterrupted
- Multiple input types affected simultaneously (monitor, registry, scripted)
- Automatic recovery after queues drain
To confirm, check splunkd.log during stall periods for:
- "queue is full" messages
- TCP connection errors to indexers
- Network timeout warnings
Other possibilities to rule out:
- Windows Event Log API resource exhaustion
- UF memory pressure
- Windows Event Log service issues
Assuming the UF's internal logs reach your indexers, this search should surface the relevant tcpout messages (replace <UF> with your forwarder's hostname):

index=_internal host=<UF> (source=*metrics.log* OR source=*splunkd.log*) tcpout
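To see whether the UF's own queues are actually filling during a stall, a blocked-queue check along these lines can help (again, <UF> is a placeholder, and this relies on the forwarder's metrics.log reaching the _internal index):

index=_internal host=<UF> source=*metrics.log* group=queue blocked=true
| timechart span=5m count by name

Queues reporting blocked=true during the stall windows would point squarely at backpressure rather than a Windows Event Log problem.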
Hope this helps narrow it down!
But if your data is destined for both output groups and one group blocks, the other one blocks as well.
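For context, that "destined for both groups" situation typically looks something like the sketch below in outputs.conf on the UF (group names and servers here are made up for illustration); when events are cloned to two groups this way, a full output queue for either destination stalls the whole pipeline:

# outputs.conf on the UF -- illustrative placeholders only
[tcpout]
defaultGroup = primary_indexers, dr_indexers

[tcpout:primary_indexers]
server = idx-prod1.example.com:9997

[tcpout:dr_indexers]
server = idx-dr1.example.com:9997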