OK, so I've got some weirdness going on with KPIs in ITSI.
I have a set of alerting data coming in that only gives me a record each time there is a state change, so I have to do some jigging around to get the last valid record and then count the Criticals / Warnings.
My search goes like this:
tag=Geneos_Severity_Alerts
| eval host=if($data.row.probe$="GW Data",replace($data.row.cell$,"_INF / probeStatus",""),$data.row.probe$)
| stats last(host) as host last(data.row.severity) as severity last(data.row.NAR-ID) as NARID last(operation) as operation by data.name
| search operation!=delete
| eval critical=if(severity="Critical",1,0)
| eval warning=if(severity="Warning",1,0)
I've checked this ad hoc: it picks up the right set of data and gives me a straightforward 1 or 0 per record, which I can sum to get the counts for a KPI. The host field represents the entities I have on the Services, and is used for filtering.
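To make the expected numbers concrete, here is a rough sketch of the kind of sum I mean, appended to the end of the search above. The output field names critical_count and warning_count are just illustrative, not something ITSI requires, and this is only the ad-hoc equivalent of the KPI calculation, not necessarily what the base search does internally.

```
| stats sum(critical) as critical_count sum(warning) as warning_count by host
```

Since the stats step keeps only the last record per data.name, each host's count should simply be the number of its alerts currently in that severity.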
The weirdness is this: I've set up the KPI base search using the above. When I then apply it to a Service, I simply get an incorrect result. Same timespan, same filtering, same calculation. If I go into a deep dive, I can see the correct result by Entity. If I open the search from there: correct result. If I flip the KPI setup on the Service to Ad-Hoc search (which then just uses the KPI base search long-hand, without me touching it): correct result.
Base Search: Nope. No dice. 2+1+1 apparently = 9. Just wrong.
Now, I have had this search (and variations of it) working OK, but as we're in dev, I've had to delete and recreate a bunch of Services / Entities.
Is this possibly a hangover of old data?