Splunk IT Service Intelligence

ITSI - Lag between KPI and Service Health Score

briannorthey
Engager

We're observing a lag between when the KPI data hits a threshold and when the KPI severity level changes color - and an even longer lag for when the Service changes color.

Example:
- Using Real Time for all Service Analyzers and Glass Tables
- KPI Search Schedule = 1 minute
- KPI changes color = 2 - 4 minutes
- Service changes color = 1 - 2 minutes after the KPI changes

The net result is a considerable lag on Glass Tables where only the Service Health Score is displayed.

Is there a way to change the configuration so there is less of a lag?

I understand a lag of up to 2 minutes (based on the KPI Search Schedule) but having a lag of up to 6 minutes on the Glass Table is not effective for our support teams.

yannK
Splunk Employee
Splunk Employee

This is not surprising.

  • data change on disk
  • the KPI run and update their summary values at best every minute - delay
  • the kpi is indexed
  • the service score is calculated based on the kpis values from the previous minute (as the current minute may not be indexed) + delay
  • the service score is indexed
  • if the service score has a dependency over another service, another minute of delay to wait for those dependent services healthscores + extra delay

also if you have indexing/forwarding slowness between the SH and the indexers, add some delay.
So it could take 2-3 minutes for the service score to flip.

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...