Observability: Splunk IT Service Intelligence (ITSI)

ArifV · 12-03-2025

Hi everyone! Here are a few questions from the session, along with the link to the on-demand recording. (The full Q&A deck and recording are also available in the #office-hours Slack channel.)

Q1:

a) Can you show us some best-in-class examples of executive dashboards made in ITSI?

b) Interested in examples of Executive Dashboards for visualizing/comparing the health of a set of critical services.

Answer

- Focus on Executive-Relevant Metrics: high-level KPIs such as service health scores, availability, and business impact indicators.
- Clear & Intuitive Visualizations: real-time data, color-coded health status, and clickable drill-downs.
- Show Service Relationships & Dependencies: visualize dependencies to understand impact across services.
- Provide Business Context: correlate IT health with business outcomes and customer experience.
- Keep Interface Simple & Focused: limit the KPIs and services shown; use summary views with drill-down options.

This approach empowers executives with a clear, actionable view of critical service health for informed decision-making and rapid response.

Documentation:

Q2: Can you give an overview of Event Analytics and demonstrate how Event IQ adds value here?

Answer:

Event Analytics (EA) is an alarm aggregation platform. It is a companion tool to Service Analytics but can also be used separately as a Monitor of Monitors.
- Event Analytics accepts alarm conditions ("notable events") from Service Analyzer as well as external (non-ITSI) tools:
  - Splunk Alerts
  - Splunk Observability (O11y) Cloud
  - Non-Splunk (third-party) monitoring tools and solutions
- EA deduplicates alarms and groups them into "episodes" (bundles of related alarms), which can trigger any defined external action via an ITSM tool, Splunk alert actions, Splunk On-Call, messaging apps and tools, orchestration software, and so on.
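To make the deduplicate-then-group idea concrete, here is a minimal Python sketch. It is only an illustration of the concept, not how ITSI implements it: the `Alarm` and `Episode` types, the `(host, signature)` fingerprint, and the 10-minute quiet gap are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Alarm:
    source: str       # hypothetical origin label, e.g. "splunk_alert"
    host: str
    signature: str    # what fired
    timestamp: float  # seconds since some reference point


@dataclass
class Episode:
    key: str
    alarms: list = field(default_factory=list)


def deduplicate(alarms):
    """Keep only the first alarm seen for each (host, signature) pair."""
    seen = set()
    unique = []
    for alarm in sorted(alarms, key=lambda a: a.timestamp):
        fingerprint = (alarm.host, alarm.signature)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(alarm)
    return unique


def group_into_episodes(alarms, quiet_gap_seconds=600):
    """Group alarms per host; a quiet gap longer than the limit opens a new episode."""
    episodes = []
    open_by_host = {}
    for alarm in sorted(alarms, key=lambda a: a.timestamp):
        episode = open_by_host.get(alarm.host)
        if episode is None or alarm.timestamp - episode.alarms[-1].timestamp > quiet_gap_seconds:
            episode = Episode(key=alarm.host)
            episodes.append(episode)
            open_by_host[alarm.host] = episode
        episode.alarms.append(alarm)
    return episodes


alarms = [
    Alarm("splunk_alert", "web01", "cpu_high", 0),
    Alarm("third_party", "web01", "cpu_high", 30),    # duplicate of the first alarm
    Alarm("third_party", "web01", "disk_full", 120),
    Alarm("splunk_alert", "db01", "replication_lag", 200),
]
episodes = group_into_episodes(deduplicate(alarms))
for episode in episodes:
    print(episode.key, [a.signature for a in episode.alarms])
```

In this toy run the duplicate `cpu_high` alarm is dropped and the remaining three alarms land in two episodes, one per host. A real aggregation policy would typically use richer grouping criteria than a single field.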

Event IQ is a new ML-powered feature in EA that intelligently groups alarms into episodes with minimal preconfiguration or understanding of alarm source relationships.
- Currently: define key fields and their perceived values to be weighted in alarm grouping decisions
- Coming: full field discovery for grouping/weighting purposes
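As a rough intuition for what "weighting key fields in grouping decisions" can mean, the Python sketch below scores two alarms by the share of field weight on which they agree. This is not Event IQ's actual model; the field names, the weights, and the 0.5 threshold are hypothetical values chosen for the example.

```python
# Hypothetical key fields and weights; higher weight = more influence on grouping.
FIELD_WEIGHTS = {"service": 3.0, "host": 2.0, "datacenter": 1.0}


def weighted_similarity(a, b, weights):
    """Return the fraction of total weight carried by fields with matching values."""
    total = sum(weights.values())
    matched = sum(
        weight
        for name, weight in weights.items()
        if name in a and a.get(name) == b.get(name)
    )
    return matched / total if total else 0.0


def assign_to_episode(alarm, episodes, weights, threshold=0.5):
    """Attach the alarm to the most similar episode, or open a new one."""
    best, best_score = None, 0.0
    for episode in episodes:
        # Compare against the first alarm of each episode as its representative.
        score = weighted_similarity(alarm, episode[0], weights)
        if score > best_score:
            best, best_score = episode, score
    if best is not None and best_score >= threshold:
        best.append(alarm)
    else:
        episodes.append([alarm])
    return episodes


a1 = {"service": "checkout", "host": "web01", "datacenter": "east"}
a2 = {"service": "checkout", "host": "web02", "datacenter": "east"}
a3 = {"service": "search", "host": "web09", "datacenter": "east"}

episodes = []
for alarm in (a1, a2, a3):
    assign_to_episode(alarm, episodes, FIELD_WEIGHTS)
print(len(episodes), [len(e) for e in episodes])
```

Here `a2` joins `a1` because they share the heavily weighted `service` field, while `a3` only matches on `datacenter` and starts its own episode.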

Documentation:

Event Analytics: https://docs.splunk.com/Documentation/ITSI/4.20.1/EA/AboutEA
Event IQ: https://help.splunk.com/en/splunk-it-service-intelligence/splunk-it-service-intelligence/detect-and-act-on-notable-events/4.21/event-correlation/automate-event-correlation-with-event-iq-in-itsi

Q3: How do you create a KPI without using a content pack, such as AppDynamics?

Answer

KPIs are created and configured in the Service configuration UI or the Service Template configuration UI. High-level configuration steps:

Step	Task	Description	Optional/Required
1	Define a KPI source search	A search string that you define as the basis for your KPI, using a data model, an ad hoc search, a metrics search, or a base search.	Required
2	Split and filter by entities	Break down the KPI to apply the search to multiple entities, enabling comparative analysis of search results on a per-entity basis. Filter entities in or out of the KPI search.	Optional
3	Configure KPI monitoring calculations	The recurring KPI search schedule and the statistical operations performed on the search results, including service health score calculations.	Required
4	Define KPI unit and monitoring lag	Define the unit of measurement to display for the KPI. Configure the monitoring lag to offset indexing lag.	Optional
5	Enable backfill	Fills the summary index with historical raw service health score data.	Optional
6	Configure KPI thresholds	Severity-level thresholds that you apply to KPI search results. Thresholds let you monitor KPI status (normal, low, medium, high, and critical) and set trigger conditions for alerts.	Required
7	Configure KPI thresholds with machine learning in ITSI	Use machine learning to analyze your KPIs with existing data and generate recommendations for optimal threshold values. Thresholds let you monitor KPI status (normal, low, medium, high, and critical) and set trigger conditions for alerts.	Optional
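The relationship between steps 2, 3, and 6 can be sketched in Python: per-entity results are aggregated into one KPI value, and a value is then mapped to a severity by threshold. The threshold boundaries, the "response time" KPI, and the entity names below are hypothetical; in ITSI you configure all of this in the UI rather than in code.

```python
import bisect
from statistics import mean

# Hypothetical thresholds for a "response time (ms)" KPI.
# Each pair is (lower bound, severity that applies from that bound upward).
THRESHOLDS = [
    (0, "normal"),
    (200, "low"),
    (400, "medium"),
    (700, "high"),
    (1000, "critical"),
]


def severity_for(value, thresholds=THRESHOLDS):
    """Map a KPI value to the severity whose lower bound it has reached."""
    bounds = [bound for bound, _ in thresholds]
    index = bisect.bisect_right(bounds, value) - 1
    return thresholds[max(index, 0)][1]


# Step 2: results split by entity (hypothetical values).
per_entity = {"web01": 180, "web02": 520}

# Step 3: aggregate calculation across entities (average here).
aggregate = mean(per_entity.values())

# Step 6: apply thresholds to the aggregate and to each entity.
print("aggregate", aggregate, severity_for(aggregate))
for entity, value in per_entity.items():
    print(entity, value, severity_for(value))
```

Note how the aggregate can look healthier than the worst entity, which is one reason splitting by entity is useful.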

Documentation:

Q4: What is “drift detection” and how is it used?

Answer:

Drift detection identifies changes in KPI behavior that happen over a long period of time and are not normally captured by Adaptive Thresholding: the "frog in the boiling pot of water" scenario.
- Adaptive Thresholds: training window of at most 60 days
- Drift Detection: training window of at least 90 days
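The Python sketch below shows why a slow drift can slip past a threshold that retrains on a short window. It is a toy model, not ITSI's algorithm: the "mean plus three standard deviations" rule, the 30-day comparison segments, and the 10% tolerance are assumptions for illustration. Only the 60-day and 90-day figures come from the answer above.

```python
from statistics import mean, pstdev

# Synthetic daily KPI values: a steady climb of 0.5 per day over 120 days, no noise.
values = [100 + 0.5 * day for day in range(120)]


def adaptive_breach(history, today, window=60):
    """Toy adaptive threshold: is today above mean + 3 sigma of the trailing window?"""
    recent = history[-window:]
    limit = mean(recent) + 3 * pstdev(recent)
    return today > limit


def drift_detected(series, min_days=90, segment=30, tolerance=0.10):
    """Toy drift check: compare the earliest and latest segments of a long history."""
    if len(series) < min_days:
        return False  # not enough history to judge long-term drift
    early = mean(series[:segment])
    late = mean(series[-segment:])
    return abs(late - early) / early > tolerance


print("adaptive threshold breached:", adaptive_breach(values[:-1], values[-1]))
print("drift detected:", drift_detected(values))
```

Because the trailing window moves up along with the data, the newest value never looks unusual against it, yet the start-to-end comparison over the longer history shows a large shift.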

Documentation:

Q5: What are best practices to deal with events as fast as they are generated?

Answer

- Identify the alert data sources.
- If ITSI supports alert data integration for these sources (i.e., tools), use the out-of-the-box (OOTB) onboarding and normalization templates to normalize alerts.
- Use lookup-based enrichment policies to enrich alerts from sources such as a CMDB.
- Use the OOTB notable event aggregation policies (NEAPs) provided in the Content Pack for Monitoring and Alerting to aggregate and group alerts.
- Run ITSI 4.21 to get the latest capabilities, including the new queue-based event pipeline that brings higher scale and reduced latency.
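Conceptually, normalization and lookup-based enrichment look like the Python sketch below. In ITSI these are configured through templates and policies rather than written by hand; the source names (`tool_a`, `tool_b`), field mappings, and CMDB contents here are hypothetical.

```python
# Hypothetical per-source field mappings onto one common schema.
FIELD_MAPS = {
    "tool_a": {"hostname": "host", "msg": "description", "sev": "severity"},
    "tool_b": {"node": "host", "summary": "description", "priority": "severity"},
}

# Hypothetical CMDB extract, keyed by host.
CMDB_LOOKUP = {
    "web01": {"owner": "web-team", "service": "checkout"},
}


def normalize(raw, source):
    """Rename source-specific fields to the common schema and tag the source."""
    mapping = FIELD_MAPS[source]
    alert = {mapping.get(name, name): value for name, value in raw.items()}
    alert["source"] = source
    return alert


def enrich(alert, lookup, key="host"):
    """Add lookup fields for the alert's key without overwriting existing fields."""
    enriched = dict(alert)
    for name, value in lookup.get(alert.get(key), {}).items():
        enriched.setdefault(name, value)
    return enriched


raw = {"node": "web01", "summary": "CPU above 95%", "priority": "high"}
alert = enrich(normalize(raw, "tool_b"), CMDB_LOOKUP)
print(alert)
```

Once every source is mapped to the same field names, the same enrichment and aggregation logic can be applied to all of them.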

Documentation:

https://help.splunk.com/en/splunk-it-service-intelligence/splunk-it-service-intelligence/detect-and-act-on-notable-events/4.21/overview/overview-of-event-analytics-in-itsi

Q6: Can we learn more about ITSI integrations with Teams and Slack?

Answer

Splunk ITSI’s Teams and Slack integrations are Splunk-supported and maintained, and are available on Splunkbase.
The integrations (content packs) do much more than notify teams:
- They streamline automatic alerting for correlated events and service health notifications
- They support actions and workflows within Teams and Slack, such as posting messages, getting information about users, or even starting a Slackbot and running health checks
- In Teams, you can configure potential actions that allow interactions with Splunk or a third-party app from within Teams
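For a sense of what such notifications look like on the wire, the Python sketch below builds (but does not send) a Slack incoming-webhook message and a Teams legacy connector card with a `potentialAction` button. Whether the ITSI content packs emit exactly these payloads is an assumption; the episode title, severity, and URL are hypothetical placeholders.

```python
import json

# Hypothetical link back to an episode; replace with a real URL in practice.
EPISODE_URL = "https://splunk.example.com/episode/123"


def slack_message(title, severity, url):
    """Build a Slack incoming-webhook payload with a link back to the episode."""
    return {"text": f"[{severity.upper()}] {title}\n<{url}|Open episode>"}


def teams_card(title, severity, url):
    """Build a Teams legacy MessageCard whose button opens the episode."""
    return {
        "@type": "MessageCard",
        "@context": "https://schema.org/extensions",
        "summary": title,
        "title": f"[{severity.upper()}] {title}",
        "text": "A new episode needs attention.",
        "potentialAction": [
            {
                "@type": "OpenUri",
                "name": "Open episode",
                "targets": [{"os": "default", "uri": url}],
            }
        ],
    }


slack_payload = slack_message("Checkout latency degraded", "high", EPISODE_URL)
teams_payload = teams_card("Checkout latency degraded", "high", EPISODE_URL)
print(json.dumps(slack_payload, indent=2))
print(json.dumps(teams_payload, indent=2))
```

Each payload would be sent as the JSON body of an HTTP POST to the webhook URL configured for the channel.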

Documentation:

Gain quick access with the Splunk App for Content Packs
Splunkbase apps for Slack and Teams
(Check here for troubleshooting Teams)
