I have been looking at network monitoring tools such as PRTG, Zabbix, etc. to produce weekly reports on Windows servers and a few in-house apps. None of them can do what I want without heavy customization.
I found Splunk by chance. I have already installed Splunk Light and have a couple of Windows servers forwarding Application and System events as well as text logs from our apps. Already I can see the possibilities.
I am wondering if anyone can provide me with some answers:
As mentioned before, the goal right now is to do weekly reporting and not necessarily active monitoring.
What I want to report on:
These online docs should help answer your questions. The short answer is yes, Splunk can handle the scenarios you describe. Splunk has a very robust set of SPL commands to transform time-series data into meaningful, actionable reports and dashboards.
This link explains how Splunk handles log rotation
I have this:
sourcetype="Perfmon:CPU Load" counter="% Processor Time" earliest="-7d@d" | bucket _time span=20s | stats avg(Value) as avgCPU by _time | where avgCPU > 80
It seems to work; I had to lower the avgCPU threshold to test it.
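For readers new to SPL, the logic of that search (group samples into 20-second buckets, average each bucket, keep the buckets above a threshold) can be sketched in plain Python. This is only an illustration of the idea, not how Splunk implements it; the function name `avg_cpu_over_threshold` and the sample data are made up for the example.

```python
from collections import defaultdict


def avg_cpu_over_threshold(samples, span=20, threshold=80.0):
    """Return (bucket_start, average) pairs whose average exceeds threshold.

    samples: iterable of (epoch_seconds, cpu_percent) tuples (hypothetical
    stand-ins for the _time and Value fields of the Perfmon events).
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Snap each timestamp down to the start of its span, like `bucket _time span=20s`.
        buckets[int(ts // span) * span].append(value)

    result = []
    for start in sorted(buckets):
        # Equivalent of `stats avg(Value) as avgCPU by _time`.
        avg = sum(buckets[start]) / len(buckets[start])
        # Equivalent of `where avgCPU > 80` (strictly greater).
        if avg > threshold:
            result.append((start, avg))
    return result


samples = [(0, 70.0), (10, 95.0), (20, 90.0), (30, 88.0), (40, 10.0)]
print(avg_cpu_over_threshold(samples))  # [(0, 82.5), (20, 89.0)]
```

Note that a single spike (95 at t=10) is smoothed by its neighbour in the same bucket, which is why lowering the threshold can be necessary when testing on a mostly idle server.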
Reading through your link, though, I have yet to find a way to include the top processes whenever the CPU goes above the threshold. I don't think perfmon keeps that data.
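To make the goal concrete: if per-process CPU samples were available, the report would need to rank processes inside each busy time bucket. A minimal Python sketch of that logic follows, under several assumptions: the sample tuples, the function name `top_processes_when_busy`, and the process names are hypothetical, aggregate instances such as `_Total` and `Idle` are assumed to be filtered out beforehand, and total load is approximated as the sum of per-process averages (on multi-core machines per-process values may need normalizing by core count).

```python
from collections import defaultdict


def top_processes_when_busy(samples, span=20, threshold=80.0, top_n=3):
    """Map each over-threshold bucket start to its top_n (process, avg_cpu) pairs.

    samples: iterable of (epoch_seconds, process_name, cpu_percent) tuples.
    """
    per_bucket = defaultdict(lambda: defaultdict(list))
    for ts, name, value in samples:
        # Group samples first by time bucket, then by process name.
        per_bucket[int(ts // span) * span][name].append(value)

    report = {}
    for start in sorted(per_bucket):
        averages = {
            name: sum(values) / len(values)
            for name, values in per_bucket[start].items()
        }
        # Assumption: total load is approximated by the sum of per-process averages.
        if sum(averages.values()) > threshold:
            ranked = sorted(averages.items(), key=lambda kv: kv[1], reverse=True)
            report[start] = ranked[:top_n]
    return report


proc_samples = [
    (0, "sqlservr", 60.0),
    (5, "w3wp", 30.0),
    (10, "sqlservr", 70.0),
    (25, "w3wp", 5.0),
]
print(top_processes_when_busy(proc_samples))
```

Only the first bucket crosses the threshold here, so the report lists its processes ranked by average CPU and omits the quiet bucket.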
I was hoping the Universal Forwarder, seeing it had the option to monitor CPU on setup (I suppose using Perfmon...), maybe has the ability to run scripts on a threshold and append the output of the scripts into the events sent to Splunk.
I mean, it is a lot of documentation, and I was admittedly skimming based on the names of those commands. Did I miss anything? Can I get some direction?
That looks about right but I can't seem to make it work.
On each server's universal forwarder I added this to \etc\system\local\wmi.conf:
[WMI:SessionProcess]
interval = 10
disabled = 0
index = perfmon_index
wql = Select ProcessId, SessionId From Win32_Process
And to \etc\system\local\inputs.conf:
[perfmon://Process]
interval = 10
object = Process
counters = % Processor Time; ID Process; Working Set - Private; IO Read Operations/sec; IO Write Operations/sec
instances = *
index = perfmon_index
disabled = 0
mode = multikv
And restarted the UniversalForwarder service. It doesn't seem to forward any processes.
Also in the samples inside the default wmi.conf and index.conf, the index is usually "perfmon" and not "perfmon_index". Which one should it be?