Splunk Search

Service status of a service, including the Disaster Recovery situation

ashraf_sj
Explorer

I have a situation with two servers, where one is active and the other is passive. I had to deploy the TA on both servers and report the status of a service.

So the active server reports the service as "Running" and the passive server reports it as "Stopped".

I have tried writing some SPL, but my worry is how to get it reported when the service stops on the active server, or when there is no data from the active server. There should always be at least one server reporting the service as "Running". Only in a DR situation would the server name change.

index=mday source="service_status.ps1" sourcetype=service_status os_service="App_Service" host=*papp01
| stats values(host) AS active_host BY status
| where status=="Running"
| append
[ search index = mday source =service_status.ps1 sourcetype = service_status os_service="App_Service" host=*papp01
| stats latest(status) AS status by host,os_service,service_name ]
| filldown active_host
| where active_host=host AND status!="Running"
| table host,active_host,os_service,service_name,status

Any help is much appreciated.


renjith_nair
Legend

There are multiple ways to achieve this. However, let's first try a simpler approach:

index=mday source="service_status.ps1" sourcetype=service_status os_service="App_Service" host=*papp01
|stats latest(status) AS status by host
|eventstats values(status) as _status
|eval OverallStatus=if(mvcount(_status) < 2 OR isnull(mvfind(_status,"Running")),"Down","Good")

Steps

- Collect the distinct status values across the hosts (values(status)) and count them (mvcount).

- If the count is less than 2, only one of Running/Stopped is present.

- If that is the case, or if Running is not among the values at all, the overall status is set to "Down".

This covers several situations: one of the servers is not reporting, both report Stopped, or even both report Running (active & passive). Each of these sets OverallStatus to "Down".

Here it is demonstrated with a dummy search:

|makeresults|eval host="HostA",status="Running"
|append[|makeresults|eval host="HostB",status="Stopped"]
|stats latest(status) as status by host
|eventstats values(status) as _status
|eval OverallStatus=if(mvcount(_status) < 2 OR isnull(mvfind(_status,"Running")),"Down","Good")

Try changing the status of HostA or HostB and see the results.
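
For reference, the outcomes of changing those statuses can also be enumerated outside Splunk. The snippet below is only a rough Python model of the eval expression, not something Splunk runs; the function name overall_status and the sample host names are hypothetical, and it assumes the only status values are "Running" and "Stopped".

```python
import re


def overall_status(latest_status_by_host):
    """Model of the eval: "Good" only when the hosts report two or more
    distinct statuses and one of them matches "Running"."""
    # values(status) keeps distinct values only, so a set models it.
    distinct = set(latest_status_by_host.values())
    # mvfind() matches a regex against each value; re.search mimics that.
    has_running = any(re.search("Running", s) for s in distinct)
    if len(distinct) < 2 or not has_running:
        return "Down"
    return "Good"


# Hypothetical scenarios: latest status per host, as produced by
# "stats latest(status) AS status by host".
cases = {
    "active Running, passive Stopped": {"HostA": "Running", "HostB": "Stopped"},
    "both Stopped": {"HostA": "Stopped", "HostB": "Stopped"},
    "both Running": {"HostA": "Running", "HostB": "Running"},
    "only one host reporting": {"HostA": "Running"},
}

for label, statuses in cases.items():
    print(f"{label}: {overall_status(statuses)}")
```

Only the first scenario comes out as "Good"; the other three come out as "Down". The model covers only the cases where at least one host returned an event. If the search returns no events at all, the SPL produces no rows, so there is no OverallStatus value to test.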

 

---
What goes around comes around. If it helps, hit it with Karma 🙂

ashraf_sj
Explorer

Thanks @renjith_nair, this works. I have used OverallStatus as the alert condition. Thanks a lot, much appreciated.
