Sorry in advance for such a long post. I'll try to describe this in a sentence or two first, in case it's so easy you don't need to read the short novel I wrote below to figure it out.
Q. I need Splunk to help me figure out "what changed" by ingesting all Windows files, directories, and registry keys, and then comparing that indexed data against the next ingest and triggering notifications, whether through a query, dashboard, ITSI service or ???? An example: someone changed a hosts file and "broke the internet," but of course there are no change-control records stating any work was done the previous night.
This may go a bit beyond what Splunk was designed for, but I've also learned that Splunk can do just about anything you can dream up. Here are a few use cases showing why I need this (and so do you).
Use Cases
Use-case 1: I need to track changes and differences between large clusters of application servers, since change control is something my organization doesn't believe in. Here are some differences I've found that caused outages: different hosts files, missing hosts files, and differences in application or OS patching, registry hives, configuration files, .DLL/file versions, and file sizes. You get the idea.
Use-case 2: Once we can figure out use-case 1, I'd like to set up triggers so that when specific changes that we define are detected, Splunk sends out an alert/email to notify some poor guy in our NOC. For example, if someone changes a hosts file or patches one of our Prod servers, notify someone.
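To make use-case 1 concrete, here is a minimal sketch of the comparison in Python. It assumes you already have one fingerprint (for example, a SHA-256 of the hosts file) per server; the function name `find_outliers` and the sample host names are made up for illustration, and `None` stands for "file missing on that server."

```python
from collections import Counter


def find_outliers(hashes_by_host):
    """Return the hosts whose fingerprint differs from the most common one.

    hashes_by_host maps a host name to a file hash, or None if the file
    is missing on that host. On a tie, the value seen first wins.
    """
    counts = Counter(hashes_by_host.values())
    majority_value, _count = counts.most_common(1)[0]
    return sorted(
        host for host, value in hashes_by_host.items() if value != majority_value
    )


if __name__ == "__main__":
    # Hypothetical cluster: app03 has a different hosts file, app04 has none.
    cluster = {"app01": "aaa", "app02": "aaa", "app03": "bbb", "app04": None}
    print(find_outliers(cluster))  # ['app03', 'app04']
```

This treats the majority as "correct," which is only a heuristic: if a bad change has already rolled out to most of the cluster, the healthy servers will show up as the outliers.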
What I've Tried
In the past I've used PowerShell DSC to automate "fixing" things when changes happen, but this is a slightly larger hammer than I want since oftentimes, as in the case of patching, we want those changes. But when something is broken and the inevitable "what changed" comes up, nobody knows, or they don't know all the files, registry updates, etc. that were involved.
I've also looked at the Security event logs, which list which files were changed or updated and who made the changes. But if you install a patch, for example, they aren't going to tell you exactly which files were changed, which reg keys were updated, etc.
I used SMS and ZENworks back in the '90s, and they worked great for identifying "what changed," but those tools don't really work on modern systems. Even sitting idle, a system now makes about a million updates and changes, which wasn't always the case.
Sysinternals Process Explorer is a great tool, but if you use the snapshot feature to see "what changed," thousands of files and reg keys are updated without you even installing or changing a single file, and the longer you wait, the more changes take place.
What I'm Doing Today
So what I currently do is run some basic "tree" and "dir" commands and pipe the output to a .txt file, using variables to name the file after the server it was run on and to time-stamp it, then copy the output to a NAS share with all the other files. When there's an issue, I load 2 or 3 files from suspicious systems into WinMerge, which automatically highlights the differences in my output files. Very slowly and manually, I find "what changed" from the night before, and I can usually solve the mystery much faster than anyone not using this approach.
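For comparison, here is a rough Python sketch of the same snapshot step that also records a content hash, so a file whose size and date look the same still shows up as changed. It assumes Python is available on the servers, which may not be true everywhere. The function names `snapshot` and `write_snapshot` and the JSON-lines layout are illustrative choices, not anything Splunk requires.

```python
import hashlib
import json
import os
import socket


def snapshot(root):
    """Yield one record per readable file under root: size, mtime, SHA-256."""
    host = socket.gethostname()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                stat = os.stat(path)
                digest = hashlib.sha256()
                with open(path, "rb") as handle:
                    # Read in 1 MiB chunks so large files don't fill memory.
                    for chunk in iter(lambda: handle.read(1 << 20), b""):
                        digest.update(chunk)
            except OSError:
                continue  # locked or unreadable files are skipped
            yield {
                "host": host,
                "path": path,
                "size": stat.st_size,
                "mtime": int(stat.st_mtime),
                "sha256": digest.hexdigest(),
            }


def write_snapshot(root, out_path):
    """Write the snapshot as JSON lines and return the number of files seen."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for record in snapshot(root):
            out.write(json.dumps(record) + "\n")
            count += 1
    return count
```

Keep the output file outside the directory being scanned, or it will end up in its own snapshot. Hashing an entire system drive is slow, so this is probably best pointed at a short list of directories that matter. It does not cover the registry at all.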
What I'm Hoping is Possible
If you're still following my ranting question, here's what I'm hoping is possible with Splunk to make this way easier, less manual, or hopefully plain old automated in a query, dashboard, or ITSI thingy.
I'd like to think that, since we're capturing perfmon, IIS logs, and the normal event logs (and whatever comes with the standard Windows TA), our Splunk indexes somehow already hold a complete listing of system files and directories (registry files would be a bonus). Then, with that data, I'd run some sort of sysdiff each morning, get notified that something changed, and output the differences (or at least the file/directory that changed) to some location we can use for further investigation.
Then, once that's working, I'd set triggers on certain files so that when a certain file, app, or regkey is changed, I proactively get an email before it's even reported as a problem.
I know that's a lot, but I also know it's possible and can at the very least be scripted. I'm just too new to Splunk and too dim to have the skillset needed to make this happen. And if there's some totally better way of doing this outside of Splunk, I'm fine buying or learning another product and having Splunk call/index that other "thing" to get our notifications out there, but I'm really trying to keep Splunk as my central hub for all our notifications one way or another. Thanks for your time!
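The "sysdiff" idea boils down to comparing two listings keyed by path. A minimal sketch in Python, assuming each snapshot has already been reduced to a mapping of path to fingerprint (hash, size, or version); the name `diff_snapshots` and the sample data are hypothetical:

```python
def diff_snapshots(old, new):
    """Compare two {path: fingerprint} mappings from consecutive snapshots."""
    old_paths, new_paths = set(old), set(new)
    return {
        "added": sorted(new_paths - old_paths),
        "removed": sorted(old_paths - new_paths),
        "changed": sorted(
            path for path in old_paths & new_paths if old[path] != new[path]
        ),
    }


if __name__ == "__main__":
    yesterday = {"a.dll": "1", "hosts": "2", "old.cfg": "3"}
    today = {"a.dll": "1", "hosts": "9", "new.cfg": "4"}
    print(diff_snapshots(yesterday, today))
    # {'added': ['new.cfg'], 'removed': ['old.cfg'], 'changed': ['hosts']}
```

The same three-way split (added, removed, changed) is what a search over two days of indexed snapshot events would need to produce, whatever tool ends up doing the comparison.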
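Since an idle system changes thousands of files on its own, the trigger step probably needs a watchlist so that only the paths we care about raise an alert. A small sketch of that filter in Python; the patterns in `WATCHLIST` and the function name `watched_changes` are examples I made up, and sending the actual email is left out:

```python
import fnmatch

# Hypothetical watchlist: glob patterns written with forward slashes, lowercase.
WATCHLIST = ["*/drivers/etc/hosts", "*.config"]


def watched_changes(changed_paths, patterns):
    """Return the changed paths that match at least one watchlist pattern."""
    hits = []
    for path in changed_paths:
        # Normalize Windows separators and case so one pattern style works.
        normalized = path.replace("\\", "/").lower()
        if any(fnmatch.fnmatchcase(normalized, pattern) for pattern in patterns):
            hits.append(path)
    return hits


if __name__ == "__main__":
    changed = [
        r"C:\Windows\System32\drivers\etc\hosts",
        r"D:\App\web.config",
        r"C:\Temp\scratch.tmp",
    ]
    for path in watched_changes(changed, WATCHLIST):
        print("ALERT:", path)
```

An allow-list like this keeps the noise down, at the cost of missing changes to anything nobody thought to list.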