Hi,
We currently have a centralized WEF collection server that collects all Windows logs across the environment.
This includes forwarding the Sysmon, Application, System, and other channels to the collector.
Everything ends up in ForwardedEvents on the WEF collection server. I've installed a UF on this host.
I have the Windows TA deployed with the following input stanza:
#[WinEventLog://ForwardedEvents]
#disabled = 0
#index = wef
#start_from = oldest
#current_only = 0
#batch_size = 50
#checkpointInterval = 15
#renderXml=true
#host=WinEventLogForwardHost
I have 2 problems currently.
#[WinEventLog://ForwardedEvents]
#disabled = true
#index = wef-sysmon
#start_from = oldest
#current_only = 0
#batch_size = 50
#checkpointInterval = 15
#renderXml=true
#host=WinEventLogForwardHost
#whitelist = $XmlRegex='Microsoft-Windows-Sysmon'
I then add
blacklist = $XmlRegex='Microsoft-Windows-Sysmon'
to the Windows TA.
Then everything seems to stop: I no longer receive any events on my indexer.
I've also tried adding multiple inputs with differing indexes and whitelists/blacklists in the Windows TA, to no avail.
Would someone be able to point me in the right direction?
Wineventlog inputs are known to have performance problems above a certain EPS threshold. This usually doesn't manifest when ingesting local events, but it shows up when pulling WEF-ed logs. Adding additional pipelines doesn't help.
The way around it (other than setting up more WEC hosts and splitting WEF subscriptions among them) is to create more event log channels and split your subscription across them. The performance problems seem to be at the single-input level, so if you're getting stuck around 10k EPS with a single input, you should be able to get up to 40k EPS if you split your ForwardedEvents into 4 channels.
Unfortunately, it's a bit of work to set up, and you need to create a custom DLL for that.
https://github.com/palantir/windows-event-forwarding/blob/master/windows-event-channels/README.md
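To illustrate the split, here is a rough sketch of what the UF side could look like once the subscriptions write to separate channels. The channel names Custom-WEC-Sysmon and Custom-WEC-Security are made up for the example (use whatever names your custom channels actually register under), and the indexes and settings are simply carried over from the stanzas earlier in the thread. Treat it as a starting point under those assumptions, not a tested config.

```
# inputs.conf on the UF installed on the WEC host (sketch only)
# Assumption: "Custom-WEC-Sysmon" and "Custom-WEC-Security" are hypothetical
# channel names; replace them with the channels you created.

# One input per channel, so each channel is read by its own input
[WinEventLog://Custom-WEC-Sysmon]
disabled = 0
index = wef-sysmon
start_from = oldest
current_only = 0
checkpointInterval = 15
renderXml = true

[WinEventLog://Custom-WEC-Security]
disabled = 0
index = wef
start_from = oldest
current_only = 0
checkpointInterval = 15
renderXml = true
```

With one stanza per channel, each channel can also go to its own index without needing whitelist/blacklist filters on a single ForwardedEvents input.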
Thanks @PickleRick
This was my thinking as well. We're only doing roughly 1500 EPS, so I'm unsure why some messages are making it through and others aren't.
Yeah, I've looked into the links you've provided previously. The problem is getting hold of ecmangen.exe: you have to install quite an old Windows 10 SDK to access it, as it's been removed from all recent SDKs.
We're running Server 2022 on our WEF collector.
Hi there,
Missing events from WEF/WEC can be caused by the log file size: if it's too small, events rotate away before the UF even has a chance to read them... don't ask how I know 😉
Increasing the size of the ForwardedEvents channel will help resolve this.
Hope this helps ...
cheers, MuS
Yup. If you start lagging behind (in our case we were about 2-2.5 hours behind during midday and would catch up during the evening and night) and Windows decides to rotate the log file, you'll probably end up missing events.
Thanks all,
I've split out the forwarded events and subscriptions to be more granular, and the dedicated Sysmon channel plus the TA is working well.
I think we're running roughly 9 minutes behind, which isn't too bad, but I want to ensure we don't miss any logs. I'm still collecting some event IDs but not seeing them in Splunk at all, although I am seeing them in other solutions.
Can I increase the cache size of the universal forwarder itself?
I've increased persistentCacheSize to 10GB, but I'm unsure whether I've set this property correctly or whether it impacts the Windows TA.
Thanks
When I enable
[WinEventLog]
persistentQueueSize=5GB
in the Windows TA, all event flow stops.
I see the queue file created in var/run/splunk/exec
but no events are indexed. When I remove that stanza, events flow again...