Hello team!
We have a problem sending data from several Domain Controllers to our Splunk instance. We collect and forward logs using the Splunk Universal Forwarder. Usually forwarding works fine, but sometimes we "lose" logs. The DCs generate a very large volume of events, and when an event log reaches its maximum size, the oldest entries start getting overwritten. As far as I can tell, the problem affects only the Windows Security event log.
Problem: the Splunk Universal Forwarder doesn't have time to read and send the data from the DC before the logs get overwritten. Could this be related to the Splunk process execution priority, or to load from other processes on the DC?
How can we solve this? Has anyone run into the same issue or have advice on how to get rid of it?
You can increase the throughput of your Universal Forwarder, but this will also use more network bandwidth. To do this, change the maxKBps setting in the limits.conf file on the Universal Forwarder.
More details are here:
https://docs.splunk.com/Documentation/Splunk/latest/Admin/Limitsconf
I also recommend following this:
It explains how to safely increase bandwidth and check its usage.
You can monitor the bandwidth usage by checking the Universal Forwarder logs on the machine at:
$SPLUNK_HOME/var/log/splunk/metrics.log
Or use this Splunk search to see the data:
index=_internal source="*metrics.log" group=thruput
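As a sketch, you can build on that search to chart throughput over time (the instantaneous_kbps field name comes from the standard thruput group in metrics.log, and the host filter is a placeholder for your DC's hostname):

index=_internal source="*metrics.log" group=thruput host=<your_dc_hostname>
| timechart span=5m avg(instantaneous_kbps) AS avg_kbps max(instantaneous_kbps) AS peak_kbps

If peak_kbps sits flat against your configured maxKBps ceiling, the forwarder is almost certainly being throttled and raising the limit should help.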
Try increasing the maxKBps setting in the forwarder's limits.conf file. It sounds like the default of 256 KBps is not enough; try 512, or 0 for unlimited.
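For reference, the change would look something like this in $SPLUNK_HOME/etc/system/local/limits.conf on the forwarder (512 is just an example value; restart the forwarder after editing):

[thruput]
# Maximum KB per second the forwarder will send; 0 means unlimited
maxKBps = 512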