We are seeing service health scores alternate between 0 and 100 in the Service Analyzer in ITSI.
The health score shows 0 even though all the underlying KPIs are OK. This happens for all of our defined services. The simplest case is shown below: a service "Azure Status" with only one defined KPI, "AzureStatus".
We recently updated to 3.0.0, but experienced the same issue before the upgrade (version 2.4.0).
Does anyone have an idea what could be causing this?
Can you add your "Azure Status" service to a glass table icon and see if you're still getting zero? This will tell us whether it's a Service Analyzer issue or an ITSI issue.
It looks to be alternating. The KPI's value is constant, but the health is switching from 100 to 0 at random intervals.
Interestingly, when I added some of the other services' health scores to the same glass table, all of the scores alternated between 0 and 100 at exactly the same time, even though there are no defined dependencies between the services.
index=azure host=azurerss sourcetype=azurestatus
| eval value=if(StatusMessage="An issue has been discovered",0,1)
Threshold field: value
Split by entity: No
Calculating Average of aggregate over the last 15 minute(s) every 5 minutes.
I see the issue. You are returning a value of 0 if the condition is true and a value of 1 if the condition is false. When ITSI averages those values over the window, the result will not work out correctly.
A better approach would be not to average the results but to sum them over the 5-minute span; if the count goes over a specified threshold, the KPI can change color.
If you take this approach, your eval should look like this:
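To see what averaging does to a 0/1 flag, here is a small sketch you can run in any search bar without touching the real data. It is only an illustration: the three synthetic events, the field `n`, and the "OK" status text are made up for the example, and it is not how ITSI computes the KPI internally. It applies the same eval as the KPI search to one "issue" event and two healthy ones, then averages the result.

```spl
| makeresults count=3
| streamstats count AS n
| eval StatusMessage=if(n=1, "An issue has been discovered", "OK")
| eval value=if(StatusMessage="An issue has been discovered",0,1)
| stats avg(value) AS avg_value
```

With one issue event out of three, avg_value comes out at about 0.67, so the averaged KPI is a fraction between 0 and 1 rather than a clean 0 or 1, and the thresholds on "value" have to account for that.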
| eval value=if(StatusMessage="An issue has been discovered",1,0)
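As a rough way to preview the summed value before changing the KPI, something like the following ad hoc search might help. It reuses the index, host, sourcetype, and eval from this thread; the `issue_count` field name and the use of `timechart` are my own choices for the sketch, and in ITSI itself the sum would be selected in the KPI's calculation settings rather than written into the search.

```spl
index=azure host=azurerss sourcetype=azurestatus
| eval value=if(StatusMessage="An issue has been discovered",1,0)
| timechart span=5m sum(value) AS issue_count
```

Each 5-minute bucket then shows how many "issue" events arrived, which should give you an idea of where to set the threshold on the sum.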