Splunk Search

Service status of a service, including the Disaster Recovery situation

ashraf_sj
Explorer

I have two servers, one active and one passive. I had to deploy the TA on both servers to report the status of a service.

So the active server reports the service as "Running" and the passive server reports it as "stopped".

I have tried writing an SPL search, but my worry is how to get it reported if the service stops on the active server, or if there is no data from the active server. There should always be at least one server reporting the service as "Running". Only in a DR situation would the server name change.

index=mday source="service_status.ps1" sourcetype=service_status os_service="App_Service" host=*papp01
| stats values(host) AS active_host BY status
| where status=="Running"
| append
[ search index=mday source="service_status.ps1" sourcetype=service_status os_service="App_Service" host=*papp01
| stats latest(status) AS status BY host,os_service,service_name ]
| filldown active_host
| where active_host=host AND status!="Running"
| table host,active_host,os_service,service_name,status

 

Any help is much appreciated.


renjith_nair
Legend

There are multiple ways to achieve this. However, let's first try a simpler approach:

index=mday source="service_status.ps1" sourcetype=service_status os_service="App_Service" host=*papp01
|stats latest(status) AS status by host
|eventstats values(status) as _status
|eval OverallStatus=if(mvcount(_status) < 2 OR isnull(mvfind(_status,"Running")),"Down","Good")

Steps

- Take the latest status for each host, then collect the distinct status values across all hosts into _status.

- If the count of distinct values is less than 2, only one of Running/Stopped is present (both hosts report the same status, or only one host is reporting).

- If that holds, OR if Running is not among the values, the overall status is set to Down.

In this way, we can handle several situations: one of the servers is down, both are reporting Stopped, or even both are reporting Running (active and passive).

Demonstrated with a dummy search

|makeresults|eval host="HostA",status="Running"
|append[|makeresults|eval host="HostB",status="Stopped"]
|stats latest(status) as status by host
|eventstats values(status) as _status
|eval OverallStatus=if(mvcount(_status) < 2 OR isnull(mvfind(_status,"Running")),"Down","Good")

Try changing the status of HostA or HostB and see the results.
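
The scenarios can also be checked in one pass. The following is a minimal sketch, not part of the original answer: the scenario, statuses, reported and distinct_status fields are made up for illustration, and mvdedup stands in for what eventstats values(status) does across hosts (collecting the distinct statuses). Scenario 4 models only one host reporting.

```spl
| makeresults count=4
| streamstats count AS scenario
| eval statuses=case(scenario=1, "Running,Stopped", scenario=2, "Running,Running", scenario=3, "Stopped,Stopped", scenario=4, "Running")
| eval reported=split(statuses, ",")
| eval distinct_status=mvdedup(reported)
| eval OverallStatus=if(mvcount(distinct_status) < 2 OR isnull(mvfind(distinct_status,"Running")),"Down","Good")
| table scenario statuses OverallStatus
```

Under these assumptions, only scenario 1 (one Running, one Stopped) should come out as Good; the other three should come out as Down.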

 

---
What goes around comes around. If it helps, hit it with Karma 🙂

ashraf_sj
Explorer

Thanks @renjith_nair, this works. I have used OverallStatus as the condition to alert. Much appreciated.
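
For anyone wanting to alert on OverallStatus in the same way, a search along these lines may work. This is a sketch under assumptions rather than the exact alert used in this thread: the last_seen field, the "(no data)" placeholder host and the "Missing" status are hypothetical names, and the appendpipe step is an addition to the accepted solution. It is there because, if neither host reports anything in the search window, stats returns no rows at all and nothing would trigger.

```spl
index=mday source="service_status.ps1" sourcetype=service_status os_service="App_Service" host=*papp01
| stats latest(status) AS status latest(_time) AS last_seen BY host
| appendpipe [ stats count | where count=0 | eval host="(no data)", status="Missing" | fields - count ]
| eventstats values(status) AS _status
| eval OverallStatus=if(mvcount(_status) < 2 OR isnull(mvfind(_status,"Running")),"Down","Good")
| where OverallStatus="Down"
| eval last_seen=strftime(last_seen, "%Y-%m-%d %H:%M:%S")
| table host status last_seen OverallStatus
```

Saved as an alert that triggers when the number of results is greater than zero, this should return rows only when the pair is not in the expected one-Running, one-Stopped state. The search time range needs to be wide enough to include at least one report from each host, otherwise a host that simply has not reported yet is treated as missing.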
