Hello Splunkers,
I'd appreciate it if anyone could help me here. I'm after a best-practices guide or article for Windows Server logs, especially for Windows AD and DC monitoring. The default Windows TA does collect a lot of logs, but I would like to look into some advanced use-case options, tuning configs, and GUID/SID translation configurations (this was required when I was using another log management solution).
I look forward to hearing back from the experienced Splunkers.
Thanks,
Regards,
Moh.
Hi Moh,
For advanced Windows Server and Active Directory/DC monitoring using Splunk, here are some best practices and resources to consider:
Hope this helps! Feel free to ask if you need specifics on any of these areas.
If this helped, some karma would be appreciated!
It's a typical "it depends" question. What you ingest depends heavily on your use cases - what you want to use the data for. How you do it depends on the ingestion method and the scale of your environment. If you want to use your data for security monitoring, you'll probably be ingesting mostly the Security event log channel. If you want to track performance, you will want - surprise, surprise - perfmon data. There is also an admon input...
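As a rough illustration of how the use case drives the inputs, the sketch below maps three use cases to `inputs.conf` stanzas and renders them as text. The stanza types (`WinEventLog://`, `perfmon://`, `admon://`) are Splunk's Windows input types; the use-case names, the chosen counter, the interval and the `render_inputs` helper are assumptions made up for the example, not a recommended baseline.

```python
# Hypothetical mapping from use case to inputs.conf stanzas.
# Stanza contents are illustrative only; tune them for your environment.
USE_CASE_INPUTS = {
    "security": {
        "WinEventLog://Security": {"disabled": "0"},
    },
    "performance": {
        "perfmon://CPU": {
            "object": "Processor",
            "counters": "% Processor Time",
            "instances": "_Total",
            "interval": "60",
            "disabled": "0",
        },
    },
    "ad_changes": {
        "admon://default": {"monitorSubtree": "1", "disabled": "0"},
    },
}


def render_inputs(use_cases):
    """Return inputs.conf text for the selected use cases."""
    lines = []
    for use_case in use_cases:
        for stanza, settings in USE_CASE_INPUTS[use_case].items():
            lines.append(f"[{stanza}]")
            for key, value in settings.items():
                lines.append(f"{key} = {value}")
            lines.append("")  # blank line between stanzas
    return "\n".join(lines)


if __name__ == "__main__":
    # Example: a DC used for security monitoring and performance tracking.
    print(render_inputs(["security", "performance"]))
```

The point is only that the list of stanzas follows from the use cases you pick, not the other way round.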
The only general rule of thumb is - use the best method you can. There are several methods of ingesting event log data, some of them apply to perfmon as well.
In order of decreasing preference from Splunk's point of view:
1. Install UFs on each DC and ingest data from local sources. This is the method that gives you the most control as well as the best performance. The downside is that you must maintain all those UFs and the security team might not be very happy about installing additional software on DCs.
2. Windows Event Forwarding - you set up WEF subscriptions from several DCs to one or more WEF collectors, and you ingest the forwarded data from the collector(s). There are far fewer UFs to maintain, but WEF is more hassle to configure (and it's out of your hands, since this has to be done by Windows/AD admins). There are possible performance issues over a certain threshold of events per second (technically the same problem can occur with method 1, but it is way less likely since each UF is dealing with one source), and there is additional latency (which can be significant with WEF in pull mode). Still, it's a pretty reasonable choice if you can't install UFs everywhere.
3. WMI - pulling data from a remote computer using WMI sessions initiated from the computer you have your UF installed on. This method does exist and is indeed used sometimes in some peculiar circumstances, but it is generally strongly advised against. WMI is a nightmare to manage and troubleshoot, and its performance is highly sub-par. Avoid it if possible - use method 2.
4. Use a third-party solution (SolarWinds, NXLog, Fluent Bit...) which reads the data from Windows sources and sends it to Splunk by means of syslog, by writing to a file accessible over the network, or by other similar setups. Theoretically you could do that, but it doesn't mean you should. Because you shouldn't. Don't even think about it. Apart from the fact that you're relying on another piece of software beyond your control, which can be difficult to manage/monitor/troubleshoot, the huge issue is that most of those third-party solutions will give you events in a format completely different from what Splunk (and the Windows TA) expects and can deal with. So if you wanted to do anything reasonable with data ingested this way, you'd have to invest a huge amount of time into handling the data so that it parses properly. Don't do it this way. Just don't.
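For methods 1 and 2, the Splunk-side difference is mostly which event log channel the UF reads: the local channel on each DC, or `ForwardedEvents` (the default destination log for WEF subscriptions) on a collector. A minimal sketch of that choice follows. The `event_log_stanza` helper and the method labels are hypothetical. `evt_resolve_ad_obj` is the `inputs.conf` setting that controls GUID/SID resolution for a channel; whether to enable it is something to test in your own environment, since resolution means the forwarder has to query a domain controller.

```python
def event_log_stanza(method, channel="Security", resolve_ad_objects=False):
    """Build an inputs.conf stanza for a UF reading Windows event logs.

    method: "local_uf" (UF on each DC) or "wef" (UF on a WEF collector).
    Both labels are made up for this example.
    """
    if method == "local_uf":
        source_channel = channel  # read the DC's own channel
    elif method == "wef":
        source_channel = "ForwardedEvents"  # default WEF destination log
    else:
        raise ValueError(f"unsupported method: {method}")
    lines = [
        f"[WinEventLog://{source_channel}]",
        "disabled = 0",
        # 1 asks the UF to resolve GUIDs/SIDs; 0 leaves them as-is.
        f"evt_resolve_ad_obj = {1 if resolve_ad_objects else 0}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(event_log_stanza("local_uf", resolve_ad_objects=True))
    print()
    print(event_log_stanza("wef"))
```

WMI and third-party shippers are deliberately left out of the helper, in line with the advice above.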