We have configured a universal forwarder on 4 Domain Controllers in our environment.
We receive security events in real time from 3 of the Domain Controllers, but events from the 4th DC appear with a lag of around 20 minutes.
Has anyone come across this issue, or is there a configuration setting I might have missed?
Thanks,
One of the issues I recently dealt with was the delay in sending security channel logs in Active Directory, which I finally resolved after a few days.
Here are the steps I took to fix the problem:
I investigated the queue issue in different pipelines.
This link explains in detail how to identify and fix queue problems to reduce delays:
index=_internal host=* blocked=true
This way, you can check whether the issue is with the universal forwarder, the heavy forwarder, or a higher tier.
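To see which queue is blocking most often, the results of that search can be tallied per queue name. The following is only a minimal sketch: it assumes the events are metrics.log lines made of comma-separated key=value pairs, and the sample lines and queue names are made up for illustration, not taken from a real system.

```python
import re
from collections import Counter

# Matches key=value pairs such as "name=parsingqueue" (assumed metrics.log layout).
KV = re.compile(r"(\w+)=([^,\s]+)")


def parse_fields(line: str) -> dict:
    """Return the key=value pairs found in one log line as a dict."""
    return dict(KV.findall(line))


def blocked_queue_counts(lines) -> Counter:
    """Count how many times each queue reported blocked=true."""
    counts = Counter()
    for line in lines:
        fields = parse_fields(line)
        if fields.get("group") == "queue" and fields.get("blocked") == "true":
            counts[fields.get("name", "unknown")] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "INFO Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=512, current_size_kb=511",
    "INFO Metrics - group=queue, name=tcpout_default, max_size_kb=500, current_size_kb=12",
    "INFO Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=512, current_size_kb=512",
    "INFO Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499",
]

counts = blocked_queue_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

The queue that blocks most often is usually the place to start looking, since queues upstream of it tend to fill up as a consequence.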
I experienced this issue with both UF and HF. I increased the queue size and added the following parameter along with the queue adjustment:
$SPLUNK_HOME/etc/system/local/server.conf
[general]
parallelIngestionPipelines = 2
https://conf.splunk.com/files/2019/slides/FN1570.pdf
To remove the throughput cap in the ingestion pipeline, I set the following parameter in limits.conf (0 means unlimited):
[thruput]
maxKBps = 0
The final and most effective step was changing the following parameter in UF’s inputs.conf:
use_old_eventlog_api=true
If you have added the parameter evt_resolve_ad_obj=true to translate SIDs/GUIDs and the translation fails, the forwarder passes the request to the next domain controller and waits for a response before proceeding, which can cause delays. To fix this, I added:
evt_dc_name=localhost
After implementing the above steps, logs were received and indexed in real time.
Thank you for taking the time to read this. I hope it helps you resolve similar issues.
Hi,
Causes and solutions could be multiple:
+ check that you are not limiting bandwidth (maxKBps=0, or set a value) (see https://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/Troubleshootingeventsindexingdel... )
+ also make sure you have evt_resolve_ad_obj = 0 in the input
+ if you have any kind of AV software running on the server, make sure you have followed the docs on excluding files AND processes for Splunk
+ use a recent version of the UF and Splunk_TA_windows
There could also be resource issues on the AD server (i.e. it may be at the limit of what the server can log).
I usually start with the following to see the indexing time delay (if any) -
<base search> 
| eval diff= _indextime - _time 
| eval diff = diff/60
| table _time diff
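The same arithmetic, sketched outside Splunk in case you export the results and want a quick summary. This is only an illustration: the function names and the sample timestamps are made up, and both values are assumed to be epoch seconds, as _time and _indextime are.

```python
from statistics import median


def indexing_lag_minutes(event_time: float, index_time: float) -> float:
    """Lag in minutes between the event timestamp and the time it was indexed."""
    return (index_time - event_time) / 60.0


def summarize(pairs) -> dict:
    """Return min, median and max lag (minutes) for (event_time, index_time) pairs."""
    lags = sorted(indexing_lag_minutes(t, it) for t, it in pairs)
    return {"min": lags[0], "median": median(lags), "max": lags[-1]}


# Hypothetical (event_time, index_time) pairs in epoch seconds.
samples = [
    (1_700_000_000, 1_700_001_080),
    (1_700_000_600, 1_700_001_980),
    (1_700_001_200, 1_700_002_880),
]

summary = summarize(samples)
print(summary)
```

A lag that stays roughly constant points to a steady bottleneck, while a lag that shrinks over time suggests the forwarder is working through a backlog.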
Thanks, I have been monitoring for a couple of hours and see the time difference hovering between 18 and 28 minutes.
If you just configured the Windows Log collection with the TA, it might be possible (depending on your configurations in inputs.conf) that the Windows TA starts indexing from the oldest Windows Events.
e.g. if your inputs.conf includes:
start_from = oldest
current_only = 0
Windows Event Logs can be very large, so it might take some time to index all the old events. In your case I would just wait one or two days and then check the latency again. If this is not the problem and you also see latency on other logs, it could also be a hardware sizing problem, see the reference hardware: https://docs.splunk.com/Documentation/Splunk/7.2.4/Capacity/Referencehardware but I can only suggest from afar. Hope this helps!
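To get a rough feel for how long such a backlog takes to clear, you can divide the backlog size by the spare forwarding capacity. The sketch below is a back-of-the-envelope estimate only: the function name and all numbers are hypothetical, and the 256 KBps figure assumes the forwarder is still running with the default thruput limit.

```python
import math


def catch_up_seconds(backlog_kb: float, forward_kbps: float, incoming_kbps: float) -> float:
    """Estimate seconds until a backlog is drained.

    backlog_kb    -- size of the old events still to be sent (assumed value)
    forward_kbps  -- rate the forwarder can send at, e.g. its maxKBps limit
    incoming_kbps -- rate at which new events keep arriving
    Returns math.inf when new data arrives at least as fast as it is sent.
    """
    spare = forward_kbps - incoming_kbps
    if spare <= 0:
        return math.inf
    return backlog_kb / spare


# Hypothetical numbers: 720,000 KB backlog, 256 KBps limit, 56 KBps of new events.
seconds = catch_up_seconds(backlog_kb=720_000, forward_kbps=256, incoming_kbps=56)
print(f"estimated catch-up time: {seconds / 3600:.1f} h")
```

If the estimate comes out as infinite, the forwarder never catches up at that rate, which would match a lag that stays constant or keeps growing rather than shrinking.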
