Splunk IT Service Intelligence

ITSI - Lag between KPI and Service Health Score

briannorthey
Engager

We're observing a lag between when the KPI data hits a threshold and when the KPI severity level changes color - and an even longer lag for when the Service changes color.

Example:
- Using Real Time for all Service Analyzers and Glass Tables
- KPI Search Schedule = 1 minute
- KPI changes color = 2 - 4 minutes
- Service changes color = 1 - 2 minutes after the KPI changes

The net result is a considerable lag on Glass Tables where only the Service Health Score is displayed.

Is there a way to change the configuration so there is less of a lag?

I understand a lag of up to 2 minutes (based on the KPI Search Schedule) but having a lag of up to 6 minutes on the Glass Table is not effective for our support teams.

yannK
Splunk Employee
Splunk Employee

This is not surprising.

  • data change on disk
  • the KPI run and update their summary values at best every minute - delay
  • the kpi is indexed
  • the service score is calculated based on the kpis values from the previous minute (as the current minute may not be indexed) + delay
  • the service score is indexed
  • if the service score has a dependency over another service, another minute of delay to wait for those dependent services healthscores + extra delay

also if you have indexing/forwarding slowness between the SH and the indexers, add some delay.
So it could take 2-3 minutes for the service score to flip.

0 Karma
Get Updates on the Splunk Community!

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...