Splunk IT Service Intelligence

Can I use a bucket on a service KPI search to count distinct hosts active over a 20m window? It always shows '0'

VexenCrabtree
Path Finder

I think I might be doing something conceptually wrong here. I've tried several combinations throughout the day and have not managed to get it right.

I'm configuring a Service KPI. The intent is to count the total number of active and standby core routers on a network, using the presence of some occasional tunnel traffic.

Here's the search I'm using for the KPI, with a "distinct count" calculation (with some real data masked into XXXX,YYYY,ZZZZ):

index=XXXX host=YYYY sourcetype=ZZZZ
| fields host
| bucket _time span=20m

The correct answer is "33". Adding things like stats distinct_count(host) and timechart span=20m (etc), I can see 33 active hosts within the 20m window when using normal searches, consistently, for the past year.

But when I use the quoted search (and other variants) as a KPI, it always shows "0". I feel like my fundamental approach is wrong, but I can't hit the nail on the head, and could do with some pointers/ideas!

1 Solution

dmillis
Splunk Employee

Creating a KPI search can be tricky. I recommend first building a search in the Search Bar that ends with "| stats ...", because that is essentially how the KPI functions. Then copy the search (without the "| stats ..." part) to use as the ad hoc search for your KPI. Your example search includes "| bucket _time span=20m", which doesn't really make sense if it is followed by "| stats ...".

Let's assume that the following search would produce the results you want for the most recent 20min period:

index=XXXX host=YYYY sourcetype=ZZZZ earliest=-20m latest=now
| stats dc(host)

You could convert this into a KPI by using the following as the ad-hoc search:

index=XXXX host=YYYY sourcetype=ZZZZ earliest=-20m latest=now

Then set the "threshold field" to 'host', set the aggregate calculation to Distinct Count, and set the KPI schedule to "every 5 minutes". This creates a KPI that updates every 5 minutes, with overlapping 20-minute search periods. Alternatively, you could use non-overlapping periods, as long as you are OK with 1-, 5-, or 15-minute periods (which are the available scheduled search intervals). In essence, the KPI "buckets" your results by running the search on a schedule.
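
To make the overlapping windows concrete, here is a rough sketch in plain Python (not SPL, and not an ITSI API). It models a search that runs every 5 minutes and counts the distinct hosts seen in the trailing 20 minutes. The function name, the router names, and the timestamps are all made up for illustration, and the half-open window (earliest inclusive, latest exclusive) is an assumption of the model.

```python
from datetime import datetime, timedelta


def distinct_hosts_per_run(events, first_run, runs,
                           every=timedelta(minutes=5),
                           window=timedelta(minutes=20)):
    """Model a scheduled KPI: at each run, count distinct hosts in the trailing window.

    events is a list of (timestamp, host) pairs. This is a hypothetical helper,
    not something ITSI provides.
    """
    results = []
    for i in range(runs):
        run_time = first_run + i * every
        earliest = run_time - window
        # Assumed half-open window: earliest <= event time < run_time
        hosts = {host for ts, host in events if earliest <= ts < run_time}
        results.append((run_time, len(hosts)))
    return results


# Hypothetical sample data: three routers sending occasional traffic
start = datetime(2024, 1, 1, 12, 0)
events = [
    (start + timedelta(minutes=1), "router-a"),
    (start + timedelta(minutes=3), "router-b"),
    (start + timedelta(minutes=3), "router-a"),
    (start + timedelta(minutes=18), "router-c"),
    (start + timedelta(minutes=24), "router-a"),
]

for run_time, count in distinct_hosts_per_run(events, start + timedelta(minutes=20), 3):
    print(run_time.strftime("%H:%M"), count)
# prints:
# 12:20 3
# 12:25 2
# 12:30 2
```

Each run looks back over its own 20 minutes, so consecutive runs share 15 minutes of data. A host that sends traffic only once is still counted by every run whose window covers that event, and it drops out once the event is more than 20 minutes old (router-b above is gone by the 12:25 run).
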
I hope this helps.

VexenCrabtree
Path Finder

Thanks a lot, that was a very clear description. By using earliest= and latest=, I didn't need to confuse my Ad Hoc search with the other elements I was trying to use.

skoelpin
SplunkTrust

I'm assuming you're querying itsi_summary and the data has already been processed by ITSI? What frequency is ITSI running at for this service? Are you trying to count the host values in the itsi_summary index? Since ITSI writes its data to a summary index, the host values will just show your ITSI search heads. You should use the splunk_server field if you want the hosts that were queried before the summary index was filled.
