On my Universal Forwarders, I want to have the ability to monitor and alert off when the Splunk Universal forwarder service has been stopped or modified.
Any options on how to do this?
I am already looking into basic windows event monitoring on windows services, but I didn't know if there was a Splunk related way to do this?
Possibly some Splunk app or something?
This has been solved many times including:
Meta Woot!: https://splunkbase.splunk.com/app/2949/
TrackMe: https://splunkbase.splunk.com/app/4621/,
Broken Hosts App for Splunk: https://splunkbase.splunk.com/app/3247/
Alerts for Splunk Admins ("ForwarderLevel" alerts): https://splunkbase.splunk.com/app/3796/
Splunk Security Essentials(https://docs.splunksecurityessentials.com/features/sse_data_availability/): https://splunkbase.splunk.com/app/3435/
Monitoring Console: https://docs.splunk.com/Documentation/Splunk/latest/DMC/Configureforwardermonitoring
Deployment Server: https://docs.splunk.com/Documentation/DepMon/latest/DeployDepMon/Troubleshootyourdeployment#Forwarde...
to monitor the universal forwarder use this query :
index=_internal source=*metrics.log group=tcpin_connections | eval Host=coalesce(hostname, sourcehost) | eval age = (now() - _time ) | stats min(age) as age, max(_time) as LastTime by Host | convert ctime(LastTime) as "Last Active On" | eval Status= case(age < 3600,"Running",age > 3600,"DOWN")|search Status=DOWN
This will show if any universal forwarder is down and will list out the host name and when it was connected last. (Will show if any forwarder is down for more then 1 hr)
Hello Prakash. I am thinking of this from a security perspective - if a malicious actor is on my network and started turning off my UFs how could I search and alert in a quick amount of time?
yes you can modify this part and set how often you need to check the status of forwarder :case(age < 3600,"Running",age > 3600,"DOWN") 3600 sec = 1hr , if you want to see the status in last 10 mins you can change it to 600. And using the query you can set an alert might little tune needed
How about if a user shuts down their machine and goes home for the evening. Any ideas on how to differentiate a case like that versus someone manually shutting down the UF service?