Alerting

What's the best way to create an alert to tell whether a Windows server is shut down or down?

Vishal2
Explorer

Can you provide an example of a search query or script I can use to tell whether a Windows server is shut down or down? I am looking for the best way to set up a shutdown or down status alert for Windows servers.

venkatasri
SplunkTrust

Hi @Vishal2 

Maybe this would work, based on the Windows EventCode descriptions. It assumes you have the Windows add-on running and indexing the WinEventLogs from the Windows servers whose shutdowns you want to detect.

 

index=<your_index> source=WinEventLog* (EventCode=41 OR EventCode=1074 OR EventCode=6006 OR EventCode=6008)
| stats count by host
| where count > 1

 

 

Event ID  Description
41        The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
1074      Logged when an app (e.g. Windows Update) causes the system to restart, or when a user initiates a restart or shutdown.
6006      Logged as a clean shutdown. It gives the message "The Event log service was stopped".
6008      Logged as a dirty shutdown. It gives the message "The previous system shutdown at <time> on <date> was unexpected".
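
If it helps to see which kind of shutdown each host logged most recently, here is a minimal sketch along the same lines. The field names last_event and shutdown_type are my own labels, not something the Windows add-on provides, and it assumes EventCode is extracted as in the search above:

```spl
index=<your_index> source=WinEventLog* (EventCode=41 OR EventCode=1074 OR EventCode=6006 OR EventCode=6008)
| eval shutdown_type=case(EventCode==41 OR EventCode==6008, "unexpected", EventCode==6006, "clean", EventCode==1074, "restart or shutdown requested")
| stats latest(_time) AS last_event latest(shutdown_type) AS shutdown_type count BY host
| eval last_event=strftime(last_event, "%Y-%m-%d %H:%M:%S")
```

Keep in mind that events 41 and 6008 are only written after the server has come back up, so they tell you about an unexpected shutdown after the fact.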

 

---

An upvote would be appreciated and Accept solution if this reply helps!

gcusello
SplunkTrust

Hi @Vishal2,

The best approach is to create a lookup (called, e.g., perimeter.csv) containing all the hosts to monitor.
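
One possible way to create that lookup from the search bar is sketched here. The three host names are placeholders, so replace them with your own; uploading a CSV file with a single "host" column through the lookup settings works just as well:

```spl
| makeresults
| eval host=split("winsrv01,winsrv02,linuxsrv01", ",")
| mvexpand host
| eval host=lower(host)
| table host
| outputlookup perimeter.csv
```

Note that outputlookup overwrites an existing perimeter.csv by default, so run this only to create the list, not to add to it.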

Then you could run a simple search like this:

| metasearch index=_internal
| eval host=lower(host)
| stats count BY host
| append [ | inputlookup perimeter.csv | eval host=lower(host), count=0 | fields host count ]
| stats sum(count) AS total BY host
| where total=0

Using this search (without the last line), you can also create a dashboard displaying the status of all the monitored hosts:

| metasearch index=_internal
| eval host=lower(host)
| stats count BY host
| append [ | inputlookup perimeter.csv | eval host=lower(host), count=0 | fields host count ]
| stats sum(count) AS total BY host
| eval status=if(total=0,"Down","Up")
| table host status
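
To see why the append with count=0 works, here is a small self-contained sketch that needs no real data. It assumes three made-up hosts in the perimeter, of which only two are "reporting":

```spl
| makeresults
| eval host=split("winsrv01,linuxsrv01", ",")
| mvexpand host
| stats count BY host
| append [ | makeresults | eval host=split("winsrv01,winsrv02,linuxsrv01", ",") | mvexpand host | eval count=0 | fields host count ]
| stats sum(count) AS total BY host
| eval status=if(total=0,"Down","Up")
| table host status
```

Every perimeter host gets a row with count=0, so a host that sent nothing still appears after the final stats, with total=0. Here winsrv02 comes out as Down and the other two as Up.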

Ciao.

Giuseppe

Vishal2
Explorer

Hi @venkatasri, @gcusello 

Using event codes is working, but can you please post the search that uses a lookup containing all the hosts to monitor?

gcusello
SplunkTrust

Hi @Vishal2 ,

You already have the solution: create a lookup containing the list of monitored hosts and run the search above. What is unclear?

Ciao.

Giuseppe

P.S.: Karma Points are appreciated by all the contributors 😉

Vishal2
Explorer

Hi @venkatasri 

Can you please post the query for Linux servers as well?

I-C-U
New Member

PowerShell for Linux?

cat filename | grep or awk?

venkatasri
SplunkTrust

@Vishal2 As @gcusello answered, it's not possible without a forwarder installed on the Linux server. You already have the solution. Since your original question about Windows has been answered, you could accept the solution and open a new post for the Linux question.

Vishal2
Explorer

Hi @venkatasri 

The Windows query is under testing; once it's successful I'll accept the solution. As for Linux, the forwarder is already installed and data is reporting to Splunk.

venkatasri
SplunkTrust

@Vishal2 You already have the solution from @gcusello for Linux/Windows, which is lookup-based. Hope that helps!

Vishal2
Explorer

Can it be used for Linux servers?

gcusello
SplunkTrust

Hi @Vishal2,

The kind of server isn't relevant, because the search uses the Splunk Forwarder's own logs.

This way you can be sure that if the server is up and the Forwarder is running, you have logs.

Ciao.

Giuseppe

Vishal2
Explorer

Hi,

Thanks for the reply.

Can you please provide the exact queries for creating an alert when Windows and Linux servers shut down?

gcusello
SplunkTrust

Hi @Vishal2,

As I said, it doesn't matter whether the operating system is Windows or Linux; what matters is the list of hosts to monitor that you put in the perimeter.csv lookup.

If you put only Windows servers in it, you'll monitor only Windows servers!

If you like (though I don't think it's useful), you can create two perimeter files (called, e.g., win_perimeter.csv and x_perimeter.csv) to monitor Windows and Linux servers separately, and create two different alerts, one for Windows and one for Linux. I don't recommend this.

If you like, you can also add the operating system to the perimeter.csv file (so the lookup has two fields, "host" and "os") and display it in the alert search:

| metasearch index=_internal
| eval host=lower(host)
| stats count BY host
| append [ | inputlookup perimeter.csv | eval host=lower(host), count=0 | fields host os count ]
| stats sum(count) AS total values(os) AS os BY host
| where total=0
| table host os

Ciao.

Giuseppe

Vishal2
Explorer

If I have only one Linux host, what is the search query to create an alert for shutdown, down, or up?

gcusello
SplunkTrust

@Vishal2,

As I said, the operating system isn't relevant, so if you have to monitor many Windows servers and one Linux server, you can add the Linux hostname to the perimeter.csv lookup that contains all the Windows servers.

If instead you have only one server to monitor, you can use a simpler search:

index=_internal host=my_hostname

My advice is to build the complete check using my previous answer and perimeter.csv, so that you're ready when you have more servers to monitor.
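
For the single-server case, a possible variant (only a sketch, with my_hostname as a placeholder) returns a row only when the host is silent. It relies on stats count returning one row with count=0 when no events are found:

```spl
| metasearch index=_internal host=my_hostname
| stats count
| eval status=if(count=0, "Down", "Up")
| where status="Down"
```

Written this way, the single-host alert can use the same trigger condition as the lookup-based search: fire when the number of results is greater than 0.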

Ciao.

Giuseppe

Vishal2
Explorer

Hi,

Can you post the alert settings?

 

gcusello
SplunkTrust

Hi @Vishal2,

The settings I use in this alert (which I configure in every installation I do!) are:

  • Search: the one in my previous answers
  • Alert Type: Scheduled
  • Time Range: depends on your requirements; I usually use 5 minutes (300 seconds)
  • Frequency: depends on the time range; for 5 minutes I use this cron expression:
    • */5 * * * *
  • Expires: 24 hours
  • Trigger Alert when: Number of Results is greater than 0 for the perimeter.csv search (which returns only the hosts that are down); Number of Results is equal to 0 for the simpler single-host search
  • Trigger: Once
  • Throttle: depends on your reaction time, e.g. 1 hour
  • Add Actions:
    • Add to Triggered Alerts with High Severity
    • Email or a script that opens a case on your trouble ticketing system
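
For anyone who prefers configuration files over the UI, the same settings could look roughly like this in savedsearches.conf. This is a sketch, not a tested configuration: the stanza name and the email address are placeholders, and it assumes the perimeter.csv search, which returns one row per silent host, so the alert fires when there is at least one result:

```ini
# Hypothetical stanza name; ops@example.com is a placeholder address
[Monitored host not reporting]
search = | metasearch index=_internal \
| eval host=lower(host) \
| stats count BY host \
| append [ | inputlookup perimeter.csv | eval host=lower(host), count=0 | fields host count ] \
| stats sum(count) AS total BY host \
| where total=0
enableSched = 1
cron_schedule = */5 * * * *
dispatch.earliest_time = -5m
dispatch.latest_time = now
counttype = number of events
relation = greater than
quantity = 0
alert.digest_mode = 1
alert.suppress = 1
alert.suppress.period = 1h
alert.expires = 24h
alert.track = 1
alert.severity = 4
action.email = 1
action.email.to = ops@example.com
```

Here alert.severity = 4 corresponds to High, alert.track = 1 adds the alert to Triggered Alerts, and alert.digest_mode = 1 triggers once per search rather than once per result.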

Ciao.

Giuseppe

 

Vishal2
Explorer

Hi,

 

You are talking about log monitoring, but if the Universal Forwarder fails, logs stop coming to Splunk as well. I don't need that; I need a query specific to the server being shut down or down.

gcusello
SplunkTrust

Hi @Vishal2,

If the Forwarder isn't sending logs, you cannot monitor your host, so it's better to monitor the Forwarder. When the alert fires, there are two possibilities:

  • forwarder down, and you must intervene, otherwise you're blind,
  • host down, and you must intervene.

In both cases you must intervene!

If you want (though I don't recommend it), you can use index=* instead of index=_internal, but the result is the same.
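
If it is useful to know how long a host has been silent, a possible variation uses tstats instead of metasearch. This is only a sketch: the 10-minute threshold is an arbitrary assumption, and the search has to run over a window (e.g. the last 24 hours) long enough to still contain the host's last event:

```spl
| tstats latest(_time) AS last_seen WHERE index=_internal BY host
| eval minutes_silent=round((now() - last_seen) / 60)
| where minutes_silent > 10
| eval last_seen=strftime(last_seen, "%Y-%m-%d %H:%M:%S")
| table host last_seen minutes_silent
```

Like the searches above, this cannot tell a stopped Forwarder from a stopped server; it only shows when the last log arrived.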

Ciao.

Giuseppe
