WEF - Universal Forwarder multiple Windows TA's / performance issues

ljo4497

Hi,

We currently have a centralized WEF collection server that collects all Windows logs across the environment.
This includes forwarding the Sysmon, Application, System and other channels to the collector.

Everything ends up in ForwardedEvents on the WEF collection server. I've installed a UF on this host.

I have the Windows TA deployed with the following input stanza:

#[WinEventLog://ForwardedEvents]
#disabled = 0
#index = wef
#start_from = oldest
#current_only = 0
#batch_size = 50
#checkpointInterval = 15
#renderXml=true
#host=WinEventLogForwardHost

I currently have two problems.

1. The Splunk universal forwarder doesn't appear to be keeping up with the number of Windows event logs arriving at the WEF collector (~1000 hosts). A collector for a different SIEM runs on the same host, keeps up fine and collects all logs, so I'm able to compare what it collects against what the Splunk UF collects. I've tried adjusting batch_size and checkpointInterval as above.

2. I want to split certain Windows channels within the ForwardedEvents channel into different indexes. I have tried deploying the Microsoft Sysmon TA and adding a new input with the following configuration:

#[WinEventLog://ForwardedEvents]
#disabled = true
#index = wef-sysmon
#start_from = oldest
#current_only = 0
#batch_size = 50
#checkpointInterval = 15
#renderXml=true
#host=WinEventLogForwardHost
#whitelist = $XmlRegex='Microsoft-Windows-Sysmon'

I then add

blacklist = $XmlRegex='Microsoft-Windows-Sysmon'

to the Windows TA.

Then everything seems to stop. I stop receiving all events on my indexer.

I've also tried adding multiple inputs with different indexes and whitelists/blacklists in the Windows TA, to no avail.

Would someone be able to point me in the right direction?
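
One possible explanation for everything stopping, offered as an assumption rather than a confirmed diagnosis: Splunk merges stanzas that share a name across apps, so defining [WinEventLog://ForwardedEvents] in two TAs does not create two inputs. The settings collapse into one effective stanza, roughly like the sketch below. Which value survives for settings that both apps define (index, disabled) depends on app precedence, so the values shown are illustrative only.

```ini
# Sketch of the single effective stanza after both TAs are merged.
# Values for index/disabled are illustrative; precedence decides the winner.
[WinEventLog://ForwardedEvents]
# If the Sysmon TA's "disabled = true" wins, the whole input is switched off.
disabled = 0
# Only one index can survive the merge, so events cannot be split this way.
index = wef
renderXml = true
# Both filters now apply to the same input: an event would have to match
# the whitelist and not match the identical blacklist, so nothing passes.
whitelist = $XmlRegex='Microsoft-Windows-Sysmon'
blacklist = $XmlRegex='Microsoft-Windows-Sysmon'
```

Running splunk btool inputs list --debug on the forwarder shows the merged result and which app each setting came from.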

PickleRick

WinEventLog inputs are known to have performance problems above a certain EPS threshold. It usually doesn't show up when ingesting local events, but it does when pulling WEF-ed logs. Adding more pipelines doesn't help.

The way around it (other than setting up more WEC hosts and splitting WEF subscriptions among them) is to create more event log channels and split your subscription across several of them. The performance problems for event log inputs seem to be at the single-input level, so if you're getting stuck around 10k EPS with a single input, you should be able to get up to 40k EPS if you split your ForwardedEvents into 4 channels.

Unfortunately, it's a bit of work to set up, and you need to create a custom DLL for that.

https://learn.microsoft.com/en-gb/archive/blogs/russellt/creating-custom-windows-event-forwarding-lo...

https://github.com/palantir/windows-event-forwarding/blob/master/windows-event-channels/README.md
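
For illustration, once the subscriptions deliver into separate custom channels, the UF side could look roughly like the sketch below: one WinEventLog stanza per channel, each with its own index. The channel names (WEC-Sysmon, WEC-Security) and the wef-security index are made up for the example; use whatever names your custom manifest actually registers. The remaining settings are carried over from the stanza in the question.

```ini
# inputs.conf on the WEC host's universal forwarder (sketch).
# Channel names are hypothetical and must match the channels
# registered by your custom event-channel DLL.

[WinEventLog://WEC-Sysmon]
disabled = 0
index = wef-sysmon
start_from = oldest
current_only = 0
renderXml = true
host = WinEventLogForwardHost

[WinEventLog://WEC-Security]
disabled = 0
# wef-security is a hypothetical index name
index = wef-security
start_from = oldest
current_only = 0
renderXml = true
host = WinEventLogForwardHost
```

Because each channel has its own stanza name, each gets its own input and its own index, with no need for whitelist/blacklist filtering on a shared ForwardedEvents stanza.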

ljo4497

Thanks @PickleRick

This was my thinking as well. We're only doing roughly 1500 EPS, so I'm unsure why some messages are not making it through while others are.

Yeah, I've looked into the links you've provided previously. The problem is getting hold of ecmangen.exe: you have to install quite an old Windows 10 SDK to get it, as it's been removed from all recent SDKs.

We're running Server 2022 on our WEF collector.

MuS

Hi there,

Missing events from WEF/WEC can be caused by the log file size: if it is too small, events rotate away before the UF even has a chance to read them .. don't ask how I know 😉

Increasing the size of the ForwardedEvents channel will help resolve this.

Hope this helps ...

cheers, MuS

PickleRick

Yup. If you start lagging behind (in our case we were about 2-2.5 hours behind at midday and would catch up during the evening and night) and Windows decides to rotate the log file, you'll probably end up missing events.

ljo4497

Thanks all,

I've split out the ForwardedEvents channel and the subscriptions to be more granular, and the dedicated Sysmon channel + the TA is working well.

I think we're running roughly 9 minutes behind, which isn't too bad, but I want to ensure we don't miss any logs. There are still some event IDs that I am collecting but not seeing in Splunk at all. I am seeing them in other solutions.

Can I increase the cache size of the universal forwarder itself?
I've increased the persistentCacheSize to 10GB, but I'm unsure whether I've set this property correctly or whether it affects the windows_TA.

Thanks

ljo4497

When I enable

[WinEventLog]
persistentQueueSize=5GB

in the windows_ta, all event flow stops.

I see the queue file created in var/run/splunk/exec

but no events are indexed. I remove that stanza, and events flow again...
