Hi all,
I have migrated a 9.0.4 HF from a Windows Server 2012 machine to a Windows Server 2022 machine. The original connector was working fine, while the new one (with the same settings) keeps crashing. This is the error I get almost every minute in the Application log in Event Viewer:
Faulting application name: splunk-winevtlog.exe, version: 2305.256.25832.56887, time stamp: 0x64e8dfcc
Faulting module name: ntdll.dll, version: 10.0.20348.1970, time stamp: 0x31881ea2
Exception code: 0xc0000374
Fault offset: 0x0000000000104909
Faulting process id: 0x1304
Faulting application start time: 0x01d9ed2bd5be870c
Faulting application path: C:\Program Files\Splunk\bin\splunk-winevtlog.exe
Faulting module path: C:\Windows\SYSTEM32\ntdll.dll
Report Id: 45c2b6fd-2c6e-484d-9602-eb948052101d
Faulting package full name:
Faulting package-relative application ID:
I tried upgrading the HF to version 9.0.6 and then to version 9.1.1, but the error persists.
It seems to be caused by the inputs configured in Splunk_TA_windows (version 8.7.0 installed). These are the enabled inputs that cause the issue:
[WinEventLog://Security]
disabled = 0
start_from = oldest
current_only = 0
evt_resolve_ad_obj = 1
checkpointInterval = 5
blacklist1 = EventCode="4662" Message="Object Type:(?!\s*groupPolicyContainer)"
blacklist2 = EventCode="566" Message="Object Type:(?!\s*groupPolicyContainer)"
blacklist3 = 4656,4658,4690,5031,5140,5150,5151,5154,5155,5156,5157,5158,5159
renderXml = false
index = wineventlog
###### Forwarded WinEventLogs (WEF) ######
[WinEventLog://ForwardedEvents]
disabled = 0
start_from = oldest
current_only = 0
checkpointInterval = 5
## The addon supports only XML format for the collection of WinEventLogs using WEF, hence do not change the below renderXml parameter to false.
renderXml = true
host = WinEventLogForwardHost
index = wineventlog
The only solution I found is to disable the ForwardedEvents input. This way the HF works as expected. I also tried to set current_only=1 on that input with no luck.
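For reference, a minimal sketch of that workaround as a local override. It assumes the override is placed in the add-on's local/inputs.conf (the usual Splunk convention, not something confirmed elsewhere in this thread), so the default stanza stays untouched:

```ini
[WinEventLog://ForwardedEvents]
# Workaround only: this stops collection from the ForwardedEvents channel entirely.
disabled = 1
```

The forwarder needs a restart for the change to take effect.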
Does anyone know if it's a known issue and how to troubleshoot this?
Regards
Alessandro
One thing: 9.1 introduced the wec_event_format parameter for Windows event log inputs. If misconfigured, it can cause your events not to be ingested at all, and maybe it can cause other problems too. You can experiment with the forwarded events format in the subscription settings and adjust this parameter accordingly.
It's an interesting thought, though the same issue is occurring on 9.0.1 for me, also on Server 2022.
Anecdotal, but I found that a few other log shoveling vendors appear to have had similar issues with the Forwarded Events log on Server 2022: agents crashing/restarting constantly. They seem to have patched their problems already, though.
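A minimal sketch of what that adjustment could look like, assuming the WEC subscription uses the "Events" content format. The value shown is an assumption and should be checked against the inputs.conf spec for your version before use:

```ini
[WinEventLog://ForwardedEvents]
disabled = 0
renderXml = true
# Assumption: this should match the subscription's content format
# (raw_event for "Events", rendered_event for "RenderedText").
wec_event_format = raw_event
```

Whether this has any effect on the crashes is untested; it only rules out a format mismatch between the subscription and the input.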
[Winlogbeat] Repeated warnings · Issue #36020 · elastic/beats (github.com)
Interesting at least.
Interesting. Splunk support hasn't had any luck with my case yet. We've been attempting different things, but nothing has worked. I may throw in the towel and downgrade to 2019.
According to Splunk on our case, version 9.2.2 will have a fix for this and it'll be released on 5/24.
They also have a custom build available that should solve it, which we're going to try next week.
Did you manage to try the version 9.2.2 they provided? They also gave it to me, but in my case it's not working. I no longer get crashes, but the splunk-winevtlog process keeps going into the "suspended" state in Task Manager. Almost nothing is actually collected from Forwarded Events...
Regards
Unfortunately, support has been slow to get us a patched version or 9.2.2 ahead of its general release. They said that the 9.2.2 release was being pushed back. I dunno, still in a holding pattern on my end.
It's concerning to hear that it didn't work for you... the "suspended" behavior is what I'm seeing on the existing 9.2.1 version.
So basically we moved from crashes (9.1) to process suspended (9.2)...I would prefer the first, at least something was collected. Thanks a lot for the feedback.
Regards
Haha yeah, I guess so. FWIW I see both on ours... Some processes are left suspended when they crash, others just crash and vanish from Task Manager.
Truthfully, I'm not sure what the difference between those two behaviors is.
No, I downgraded my operating system to Server 2019 and everything started working. Ran into a different issue afterwards, unfortunately: too much data was being forwarded and I had to start using a heavy forwarder.
Would have been nice for Splunk support to mention this. I've had to move on and decommission Server 2022. Installed 2019 like you suggested and everything is working as it should. Thanks again.
Well.... I appreciate you helping me confirm it's just 2022 😄
Yep, Server 2022 was the only outlier for us. The issue was consistent across a few 9.x UF versions as well: 9.0.1, 9.1.0 and 9.2.1.
All had the same behavior on Server 2022 but not older win server platforms. Honestly if my infrastructure wasn't already up and running on 2022 I'd downgrade to 2019.
Really? Only 2022? I may downgrade if that's the case. I have a support ticket open with Splunk and so far no luck, and no mention of a version conflict. I may downgrade and test.
Can we bump this? I'm running into the same issue.
A temporary workaround that worked for us was setting current_only to 1 and restarting the forwarder....
splunk-winevtlog.exe still crashes and restarts, but it does at least read some events and send them before it does.
I find it strange that the other event logs forward just fine and don't crash. It's only when forwarding the "Forwarded Events" log. We can't be the only people using Windows Event Collectors to collect events and then forward them to a Splunk server.
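A minimal sketch of that workaround, based on the ForwardedEvents stanza posted earlier in this thread with only current_only changed. Treat it as a partial mitigation, not a fix:

```ini
[WinEventLog://ForwardedEvents]
disabled = 0
# Workaround: only collect events that arrive after the input starts.
current_only = 1
checkpointInterval = 5
renderXml = true
index = wineventlog
```

Restart the forwarder after changing it. Note that the original poster reported no improvement from the same setting, so results seem to vary between environments.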
Yeah, we have 14 servers acting as our WEF environment all with the same UF version and conf pushed out from central management/deployment. There are 6 that are Server 2016, 4 are Server 2019, and another 4 are Server 2022.
Only the Server 2022 boxes have this issue.
I've messed around with various .conf settings trying to band-aid it, and only "current_only = 1" seems to make a difference.
I've packed up procmon pml and .dmp files for support to look at... dunno if there's a fix possible.... I'll post back if I hear anything.