Windows TA DNS log parsing bad regex

bearda

I'm using the Splunk Add-on for Microsoft Windows to parse logs from a couple of Windows 2019 DNS servers. Things seemed to be working OK, but we noticed some weird behavior with the src_domain field whenever the domain being resolved contained a dash. The domain name would be truncated at the first dash, so instead of

(31)maya-apiserver-867994d7dd-tthh9(0) -> maya-apiserver-867994d7dd-tthh9

we would instead get:

(31)maya-apiserver-867994d7dd-tthh9(0) -> maya

This has caused some issues with our ESXi hosts: they're all named something like vhost-01, so they all got truncated down to the same vhost. We traced this to a regex in the Windows TA that looks wrong:

REGEX = (\(\d\)*[\w+\(\d\)]{1,})

This is trying to match a domain name like:

(9)pod01-id1(4)eus2(6)backup(12)windowsazure(3)com(0)
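For anyone not familiar with the notation: each (N) appears to be the length of the label that follows it, and (0) ends the name. Here's a rough Python sketch of turning it into a dotted name. The decode_dns_name helper is something I made up for illustration; it is not how the TA does the conversion.

```python
import re

def decode_dns_name(raw):
    """Turn '(3)www(7)example(3)com(0)' style text into 'www.example.com'.

    Hypothetical helper for illustration only, not the TA's implementation.
    """
    # Capture the text after each "(N)" prefix; the trailing "(0)" has no
    # label after it, so it contributes nothing.
    labels = re.findall(r"\(\d+\)([^()]+)", raw)
    return ".".join(labels)

print(decode_dns_name("(9)pod01-id1(4)eus2(6)backup(12)windowsazure(3)com(0)"))
print(decode_dns_name("(31)maya-apiserver-867994d7dd-tthh9(0)"))
```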

I have to admit that I'm fairly confused by the syntax, since it seems like an inappropriate use of a character class: inside the brackets, the +, parentheses, and \d are just extra literal characters rather than quantifiers or groups. From what I can see, the use of \w restricts domain characters to word characters, which do not include the dash. It also allows underscores, which aren't valid in hostnames. We replaced it with this regex and are now getting much better results:

REGEX = (\(\d+\)(?:[a-zA-Z0-9\-]+\(\d+\)){1,})
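If anyone wants to compare the two patterns outside Splunk, here's a quick Python sketch using the two sample names from this post. Python's re module is only a stand-in for Splunk's regex engine here (my assumption is that these particular constructs behave the same in both), and first_capture is just a helper name I picked.

```python
import re

# Both patterns are copied verbatim from the REGEX lines in this post.
OLD = re.compile(r"(\(\d\)*[\w+\(\d\)]{1,})")
NEW = re.compile(r"(\(\d+\)(?:[a-zA-Z0-9\-]+\(\d+\)){1,})")

def first_capture(pattern, raw):
    """Return capture group 1 of the first match, or None if nothing matches."""
    m = pattern.search(raw)
    return m.group(1) if m else None

samples = [
    "(31)maya-apiserver-867994d7dd-tthh9(0)",
    "(9)pod01-id1(4)eus2(6)backup(12)windowsazure(3)com(0)",
]

for raw in samples:
    print("raw:", raw)
    print("  old:", first_capture(OLD, raw))
    print("  new:", first_capture(NEW, raw))
```

With the old pattern the capture stops at the first dash, while the new one captures the whole length-prefixed name.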

Is there any way this change can be evaluated for the next rev of the Windows TA?
