WEF - Universal Forwarder multiple Windows TA's / performance issues

ljo4497
Explorer

Hi, 

We currently have a centralized WEF collection server that collects all Windows logs across the environment.
This includes forwarding the Sysmon, Application, System channels, etc. to the collector.

Everything ends up in ForwardedEvents on the WEF collection server. I've installed a UF on this host.

I have the Windows TA deployed with the following input stanza:

#[WinEventLog://ForwardedEvents]
#disabled = 0
#index = wef
#start_from = oldest
#current_only = 0
#batch_size = 50
#checkpointInterval = 15
#renderXml=true
#host=WinEventLogForwardHost

I have two problems currently.

  • The Splunk universal forwarder doesn't appear to be keeping up with the number of Windows event logs coming to the WEF collector (~1000 hosts). Another (different) SIEM collector for WEF keeps up fine on the same host and collects all logs, so I'm able to compare what that collector is collecting vs. the Splunk UF. I've tried adjusting the batch_size and checkpointInterval as above.

  • I want to split certain Windows channels in the ForwardedEvents channel to different indexes. I have tried deploying the Microsoft Sysmon TA and adding a new input with the following configuration:

#[WinEventLog://ForwardedEvents]
#disabled = true
#index = wef-sysmon
#start_from = oldest
#current_only = 0
#batch_size = 50
#checkpointInterval = 15
#renderXml=true
#host=WinEventLogForwardHost
#whitelist = $XmlRegex='Microsoft-Windows-Sysmon'

I then add

blacklist = $XmlRegex='Microsoft-Windows-Sysmon'

to the Windows TA.

Then everything seems to stop: I no longer receive any events on my indexer.

I've also tried adding multiple inputs with differing indexes and whitelist/blacklists in the windows TA to no avail.
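
One possible explanation, offered as a sketch rather than a confirmed diagnosis: Splunk merges stanzas that share the same name across apps, so two [WinEventLog://ForwardedEvents] stanzas (one per TA) do not behave as two separate inputs. Assuming that is what happens here, the forwarder would end up with a single effective input roughly like the one below, where the same pattern is both whitelisted and blacklisted. Since a blacklist match excludes an event that the whitelist let through, nothing would be forwarded. Which index and disabled values win depends on app precedence, so they are left out.

```ini
# Hypothetical merged view of the two same-named stanzas.
# This is an assumption for illustration, not real btool output.
[WinEventLog://ForwardedEvents]
renderXml = true
host = WinEventLogForwardHost
# index and disabled come from whichever app wins precedence.
# Both filters now apply to the one input:
whitelist = $XmlRegex='Microsoft-Windows-Sysmon'
blacklist = $XmlRegex='Microsoft-Windows-Sysmon'
```

Running `splunk btool inputs list --debug` on the forwarder shows the actual merged result and which app each line came from.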

Would someone be able to point me in the right direction?

PickleRick
SplunkTrust

WinEventLog inputs are known to have performance problems above a certain EPS threshold. This usually doesn't show up when ingesting local events, but it does when pulling WEF-ed logs. Adding additional pipelines doesn't help.

The way around it (other than setting up more WEC hosts and splitting WEF subscriptions among them) is to create more event log channels and split your subscription across several of them. The performance problems seem to be at the level of a single input, so if you're getting stuck at around 10k EPS with a single input, you should be able to get up to about 40k EPS if you split your ForwardedEvents into 4 channels.

Unfortunately, it's a bit of work to set up, and you need to create a custom DLL for that.

https://learn.microsoft.com/en-gb/archive/blogs/russellt/creating-custom-windows-event-forwarding-lo...

https://github.com/palantir/windows-event-forwarding/blob/master/windows-event-channels/README.md
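
Once the extra channels exist, each one gets its own input stanza on the UF. The following is a minimal sketch only: the channel names WEC-Sysmon and WEC-Security and the index wef-security are hypothetical placeholders (use whatever names your custom channels are registered under), and the remaining settings are carried over from the stanza in the question.

```ini
# Hypothetical channel names; replace with the custom channels you created.
[WinEventLog://WEC-Sysmon]
disabled = 0
index = wef-sysmon
renderXml = true
host = WinEventLogForwardHost
start_from = oldest
current_only = 0

[WinEventLog://WEC-Security]
disabled = 0
# Hypothetical index name.
index = wef-security
renderXml = true
host = WinEventLogForwardHost
start_from = oldest
current_only = 0
```

Because each stanza has a unique name, each one is read as a separate input and can be routed to its own index without whitelist/blacklist filtering on ForwardedEvents.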

 

ljo4497
Explorer

Thanks @PickleRick 

This was my thinking as well. We're really only doing around 1500 EPS, so I'm unsure why some messages are making it through and others aren't.

Yes, I've looked into the links you've provided previously. The problem is getting hold of ecmangen.exe: you have to install quite an old Windows 10 SDK to get it, as it's been removed from all recent SDKs.

We're running Server 2022 on our WEF collector.

MuS
Legend

Hi there,

Missing events from WEF/WEC can be caused by the log file size: if it's too small, events rotate away before the UF even has a chance to read them ... don't ask how I know 😉

Increasing the size of the ForwardedEvents channel will help resolve this.

Hope this helps ...

cheers, MuS

PickleRick
SplunkTrust

Yup. If you start lagging behind (in our case we were about 2-2.5 hours behind during midday; we would catch up during the evening and night) and Windows decides to rotate the log file, you'll probably end up missing events.

ljo4497
Explorer

Thanks all,

I've split out the forwarded events and subscriptions to be more granular, and the dedicated Sysmon channel plus the TA is working well.

I think we're running roughly 9 minutes behind, which isn't too bad, but I want to ensure we don't miss any logs. I'm still collecting some event IDs that I'm not seeing in Splunk at all, although I am seeing them in other solutions.

Can I increase the cache size of the universal forwarder itself?
I've increased the persistentCacheSize to 10GB, but I'm unsure whether I've set this property correctly or whether it impacts the Windows TA.

Thanks

ljo4497
Explorer

When I enable

[WinEventLog]
persistentQueueSize=5GB

in the Windows TA, all event flow stops.

I see the queue file created in var/run/splunk/exec, but no events are indexed. When I remove that stanza, events flow again...