Hi everyone! Here are a few questions from the session, as well as the link to the on-demand recording (the full Q&A deck and recording are also available in the #office-hours Slack channel).
Q1:
a) Can you show us some best-in-class examples of executive dashboards made in ITSI?
b) Interested in examples of Executive Dashboards for visualizing/comparing the health of a set of critical services.
Answer:
Focus on Executive-Relevant Metrics: High-level KPIs like service health scores, availability, and business impact indicators.
Clear & Intuitive Visualizations: Real-time data, color-coded health status, and clickable drill-downs.
Show Service Relationships & Dependencies: Visualize dependencies to understand impact across services.
Provide Business Context: Correlate IT health with business outcomes and customer experience.
Keep Interface Simple & Focused: Limit KPIs and services shown; use summary views with drill-down options.
This approach empowers executives with a clear, actionable view of critical service health for informed decision-making and rapid response.
Documentation:
Splunk Docs: Glass Tables
The Top 10 Glass Table/Dashboard Design Principles to Boost Your Career and Your Business
Q2: Can you give an overview of Event Analytics and demonstrate how Event IQ adds value here?
Answer:
Event Analytics (EA) is an alarm aggregation platform. It is a companion tool to Service Analytics but may also be used separately as a Monitor of Monitors.
Event Analytics accepts alarm conditions ("notable events") from Service Analyzer as well as from external (non-ITSI) tools:
Splunk Alerts
Splunk Observability (O11y) Cloud
Non-Splunk (3rd party) monitoring tools and solutions
EA deduplicates alarms and groups them into "episodes" (bundles of related alarms). An episode can trigger any defined external action via an ITSM tool, Splunk alert actions, Splunk On-Call, messaging apps and tools, orchestration software, and more.
Event IQ is a new ML-powered feature in EA that groups alarms into episodes intelligently, with minimal preconfiguration or understanding of alarm source relationships.
Currently: you define key fields and their perceived values to be weighted in alarm grouping decisions
Coming: full field discovery for grouping/weighting purposes
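To make the deduplication and weighted-grouping idea concrete, here is a minimal Python sketch. It is not how EA or Event IQ is implemented (Event IQ is ML-based); the field names, weights, and threshold below are all hypothetical, and the similarity rule is a simple stand-in for illustration only.

```python
from dataclasses import dataclass, field

# Hypothetical weights: how much a match on each field counts toward grouping.
FIELD_WEIGHTS = {"host": 0.5, "service": 0.3, "signature": 0.2}
# Assumed cutoff: an event joins an episode when similarity reaches this value.
GROUPING_THRESHOLD = 0.5


@dataclass
class Episode:
    events: list = field(default_factory=list)


def dedup_key(event: dict) -> tuple:
    """Events that agree on every weighted field count as duplicates."""
    return tuple(event.get(name) for name in FIELD_WEIGHTS)


def similarity(a: dict, b: dict) -> float:
    """Sum the weights of the fields on which both events agree."""
    return sum(
        weight
        for name, weight in FIELD_WEIGHTS.items()
        if a.get(name) is not None and a.get(name) == b.get(name)
    )


def group_events(events: list) -> list:
    """Drop exact duplicates, then attach each event to the first similar episode."""
    seen = set()
    episodes = []
    for event in events:
        key = dedup_key(event)
        if key in seen:
            continue  # duplicate alarm, already represented
        seen.add(key)
        for episode in episodes:
            # Compare against the episode's first event as its representative.
            if similarity(event, episode.events[0]) >= GROUPING_THRESHOLD:
                episode.events.append(event)
                break
        else:
            episodes.append(Episode(events=[event]))
    return episodes


events = [
    {"host": "web01", "service": "checkout", "signature": "high_cpu"},
    {"host": "web01", "service": "checkout", "signature": "high_cpu"},  # duplicate
    {"host": "web01", "service": "checkout", "signature": "slow_response"},
    {"host": "db07", "service": "inventory", "signature": "disk_full"},
]
episodes = group_events(events)
for number, episode in enumerate(episodes, start=1):
    print(number, [e["signature"] for e in episode.events])
```

In this toy data the duplicate alarm is dropped, the two web01 alarms land in one episode because they match on host and service, and the db07 alarm starts its own episode.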
Documentation:
Event Analytics: https://docs.splunk.com/Documentation/ITSI/4.20.1/EA/AboutEA
Event IQ: https://help.splunk.com/en/splunk-it-service-intelligence/splunk-it-service-intelligence/detect-and-act-on-notable-events/4.21/event-correlation/automate-event-correlation-with-event-iq-in-itsi
Q3: How do you create a KPI without using a content pack, such as AppDynamics?
Answer:
KPIs are created and configured in the Service configuration UI or the Service Template configuration UI. High-level configuration steps:
| Step | Task | Description | Optional/Required |
|------|------|-------------|-------------------|
| 1 | Define a KPI source search | A search string that you define as the basis for your KPI, using a data model, an ad hoc search, a metrics search, or a base search. | Required |
| 2 | Split and filter by entities | Break down the KPI to apply the search to multiple entities, enabling comparative analysis of search results on a per-entity basis. Filter entities in or out of the KPI search. | Optional |
| 3 | Configure KPI monitoring calculations | The recurring KPI search schedule and the statistical operations performed on the search results, including service health score calculations. | Required |
| 4 | Define KPI unit and monitoring lag | Define the unit of measurement to display for the KPI. Configure the monitoring lag to offset indexing lag. | Optional |
| 5 | Enable backfill | Fills the summary index with historical raw service health score data. | Optional |
| 6 | Configure KPI thresholds | Severity-level thresholds that you apply to KPI search results. Thresholds let you monitor KPI status (normal, low, medium, high, and critical) and set trigger conditions for alerts. | Required |
| 7 | Configure KPI thresholds with machine learning in ITSI | Use machine learning to analyze your KPIs with existing data and generate recommendations for optimal threshold values. Thresholds let you monitor KPI status (normal, low, medium, high, and critical) and set trigger conditions for alerts. | Optional |
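As a rough illustration of how steps 2, 3, and 6 fit together, the Python sketch below splits KPI samples by entity, averages them, and maps the results onto severity levels. In ITSI this work is done by scheduled searches and the threshold configuration UI; the threshold values, entity names, and the choice of a mean as the calculation are assumptions made for this example.

```python
from statistics import mean

# Hypothetical static thresholds as (lower bound, severity), highest bound first.
THRESHOLDS = [(90.0, "critical"), (75.0, "high"), (60.0, "medium"), (40.0, "low")]


def severity(value: float, thresholds=THRESHOLDS) -> str:
    """Return the first severity whose lower bound the value reaches."""
    for lower_bound, level in thresholds:
        if value >= lower_bound:
            return level
    return "normal"


def evaluate_kpi(samples: list) -> dict:
    """samples is a list of (entity, value) pairs from one KPI search run."""
    by_entity = {}
    for entity, value in samples:
        by_entity.setdefault(entity, []).append(value)  # step 2: split by entity
    entity_values = {e: mean(v) for e, v in by_entity.items()}  # step 3: calculation
    aggregate = mean(list(entity_values.values()))
    return {
        "per_entity": {e: severity(v) for e, v in entity_values.items()},  # step 6
        "aggregate_value": aggregate,
        "aggregate_severity": severity(aggregate),
    }


# Hypothetical CPU utilization samples (percent) for three entities.
cpu_samples = [("web01", 95.0), ("web01", 85.0), ("web02", 50.0), ("web03", 30.0)]
result = evaluate_kpi(cpu_samples)
print(result["per_entity"])
print(result["aggregate_severity"])
```

Splitting by entity is what lets one noisy host show up as critical while the aggregate KPI stays at a lower severity.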
Documentation:
Splunk Docs: Create KPIs
Splunk Lantern: Using SRE golden signals for KPIs
Q4: What is “drift detection” and how is it used?
Answer:
Drift detection identifies changes in KPI behavior that develop over a long period and are not normally captured by Adaptive Thresholding: the "frog in the boiling pot of water" scenario.
Adaptive Thresholding: trains on a maximum of 60 days of data
Drift Detection: trains on a minimum of 90 days of data
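The difference between the two training windows can be shown with a toy example. The Python sketch below is not ITSI's algorithm: it assumes a simple "mean plus or minus 3 standard deviations" rule over a rolling 60-day window as a stand-in for adaptive thresholding, and a comparison of the first and last 30 days of a 90+ day history as a stand-in for drift detection. The synthetic KPI, the sigma limit, and the 10% drift limit are all assumptions.

```python
from statistics import mean, stdev

ADAPTIVE_WINDOW_DAYS = 60  # from the session: adaptive thresholds train on at most 60 days
DRIFT_MIN_DAYS = 90        # from the session: drift detection trains on at least 90 days
SIGMA_LIMIT = 3.0          # assumed breach rule for the rolling window
DRIFT_LIMIT = 0.10         # assumed: flag drift above a 10% relative change

# Synthetic daily KPI that creeps upward by 0.2 per day for 120 days.
daily_kpi = [100.0 + 0.2 * day for day in range(120)]


def adaptive_breaches(series: list) -> list:
    """Days on which the value leaves mean +/- SIGMA_LIMIT * stdev of the prior window."""
    breaches = []
    for day in range(ADAPTIVE_WINDOW_DAYS, len(series)):
        window = series[day - ADAPTIVE_WINDOW_DAYS:day]
        if abs(series[day] - mean(window)) > SIGMA_LIMIT * stdev(window):
            breaches.append(day)
    return breaches


def relative_drift(series: list, span: int = 30) -> float:
    """Relative change between the first and last `span` days of a long history."""
    if len(series) < DRIFT_MIN_DAYS:
        raise ValueError("need at least 90 days of history")
    baseline = mean(series[:span])
    recent = mean(series[-span:])
    return (recent - baseline) / baseline


breaches = adaptive_breaches(daily_kpi)
drift = relative_drift(daily_kpi)
print("rolling-window breaches:", len(breaches))
print("relative drift:", round(drift, 3), "flagged:", drift > DRIFT_LIMIT)
```

Because the rolling window moves along with the slow climb, no single day looks abnormal against it, while the long-history comparison shows a change of roughly 17%.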
Documentation:
Splunk Docs: Configure Drift Detection
Splunk Docs: Migrate Anomaly Detection to Adaptive Thresholding
Q5: What are best practices to deal with events as fast as they are generated?
Answer:
Identify the alert data sources
If ITSI supports alert data integration for these sources (i.e., tools), use the out-of-the-box (OOTB) onboarding and normalization templates to normalize alerts
Use lookup-based enrichment policies to enrich alerts with data from sources such as a CMDB
Use the OOTB notable event aggregation policies (NEAPs) provided in the Content Pack for Monitoring and Alerting to aggregate and group alerts
Run ITSI 4.21 to make use of the latest capabilities, including the new queue-based event pipeline, which brings higher scale and reduced latency
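The normalization and lookup-based enrichment steps above can be pictured with a short Python sketch. ITSI handles this through its own templates and enrichment policies; the source names, field mappings, and CMDB contents below are hypothetical and only illustrate the idea of mapping each tool's fields onto a common schema and then joining on a lookup key.

```python
# Hypothetical per-source mappings from native field names to a common schema.
FIELD_MAPS = {
    "tool_a": {"hostname": "host", "sev": "severity", "msg": "description"},
    "tool_b": {"node": "host", "priority": "severity", "summary": "description"},
}

# Hypothetical CMDB lookup keyed by host.
CMDB = {
    "web01": {"owner": "payments-team", "tier": "1"},
    "db07": {"owner": "data-platform", "tier": "2"},
}


def normalize(source: str, raw: dict) -> dict:
    """Rename a raw alert's fields to the common schema for its source."""
    mapping = FIELD_MAPS[source]
    event = {common: raw[native] for native, common in mapping.items() if native in raw}
    event["source"] = source
    return event


def enrich(event: dict, lookup: dict = CMDB) -> dict:
    """Add lookup fields for the event's host; unknown hosts pass through unchanged."""
    return {**event, **lookup.get(event.get("host"), {})}


raw_alerts = [
    ("tool_a", {"hostname": "web01", "sev": "critical", "msg": "CPU above 95%"}),
    ("tool_b", {"node": "db07", "priority": "high", "summary": "Disk 90% full"}),
    ("tool_b", {"node": "cache03", "priority": "low", "summary": "Evictions rising"}),
]
alerts = [enrich(normalize(source, raw)) for source, raw in raw_alerts]
for alert in alerts:
    print(alert)
```

Once every alert shares the same field names and carries ownership context, grouping policies can key on those fields regardless of which tool raised the alert.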
Documentation:
https://help.splunk.com/en/splunk-it-service-intelligence/splunk-it-service-intelligence/detect-and-act-on-notable-events/4.21/overview/overview-of-event-analytics-in-itsi
Q6: Can we learn more about ITSI integrations with Teams and Slack?
Answer:
Splunk ITSI’s Teams and Slack integrations are supported and maintained by Splunk and are available on Splunkbase.
The integrations (content packs) do way more than help notify teams:
They streamline automatic alerting for correlated events and service health notifications
They support actions and workflows within Teams and Slack, like posting messages, getting info about users, or even starting a SlackBot and making health checks
In Teams, you can configure potential actions that let users interact with Splunk or a 3rd-party app directly from Teams
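For a sense of what such a notification might carry, the Python sketch below builds two message bodies without sending them. It does not reflect how the Splunkbase apps work internally: it assumes the simple `text` body accepted by Slack incoming webhooks and the legacy Teams MessageCard format with a `potentialAction` entry, and the episode title, severity, and URL are hypothetical.

```python
import json


def slack_payload(episode_title: str, severity: str) -> dict:
    """Minimal body in the style of a Slack incoming webhook message."""
    return {"text": f"[{severity.upper()}] {episode_title}"}


def teams_payload(episode_title: str, severity: str, episode_url: str) -> dict:
    """Legacy MessageCard-style body with one 'open link' potential action."""
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "summary": episode_title,
        "title": f"[{severity.upper()}] {episode_title}",
        "text": "A new episode needs attention.",
        "potentialAction": [
            {
                "@type": "OpenUri",
                "name": "Open episode",
                "targets": [{"os": "default", "uri": episode_url}],
            }
        ],
    }


# Hypothetical episode details and URL, for illustration only.
title = "Checkout service degraded"
url = "https://splunk.example.com/episode/123"

slack_body = slack_payload(title, "critical")
teams_body = teams_payload(title, "critical", url)
print(json.dumps(slack_body))
print(json.dumps(teams_body, indent=2))
```

The potential action is what turns a plain notification into something a user can act on from inside Teams, here by opening the episode.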
Documentation:
Gain quick access with the Splunk App for Content Packs
Splunkbase apps for Slack and Teams
(Check here for troubleshooting Teams)