Splunk IT Service Intelligence

Can I use bucket in a service KPI search to count distinct hosts active over a 20m window? It always shows '0'

VexenCrabtree
Path Finder

I think I might be doing something conceptually wrong with this, but I've tried several combinations throughout the day today, and not managed to get it right.

I'm configuring a Service KPI. The intent is to count the total number of active and standby core routers on a network, using the presence of some occasional tunnel traffic.

Here's the search I'm using for the KPI, with a "distinct count" calculation (with some real data masked into XXXX,YYYY,ZZZZ):

index=XXXX host=YYYY sourcetype=ZZZZ
| fields host
| bucket _time span=20m

The correct answer is "33". Adding things like stats distinct_count(host) and timechart span=20m (etc), I can see 33 active hosts within the 20m window when using normal searches, consistently, for the past year.

But when I use the quoted search (and other variants) as a KPI, it always shows "0". I feel like my fundamental approach is wrong, but I can't hit the nail on the head, and could do with some pointers/ideas!


dmillis
Splunk Employee

Creating a KPI search can be tricky. I recommend creating a search in the Search Bar which ends with "| stats ...", because this is essentially how the KPI functions. Then copy the search (without the "| stats ...") to use as the ad-hoc search for your KPI. Your example search includes "| bucket _time span=20m", which doesn't really make sense if it is followed by "| stats ...".

Let's assume that the following search would produce the results you want for the most recent 20min period:

index=XXXX host=YYYY sourcetype=ZZZZ earliest=-20m latest=now
| stats dc(host)

You could convert this into a KPI by using the following as the ad-hoc search:

index=XXXX host=YYYY sourcetype=ZZZZ earliest=-20m latest=now

Then set the "threshold field" to 'host', set the aggregate calculation to Distinct Count, and set the KPI schedule to "every 5 minutes". This creates a KPI which updates every 5 min, with overlapping 20 min search periods. Alternatively, you could use non-overlapping periods, as long as you are OK with 1, 5 or 15 min periods (which are the available scheduled search intervals). In essence, the KPI "buckets" your results by running the search on a schedule, so the search itself does not need a bucket command.
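To make the overlapping-window idea concrete, here is a rough Python simulation of what I mean. It is not ITSI code: the function name, the sample events and the router names are all made up for illustration. It just mimics a search that runs every 5 minutes, looks back 20 minutes, and takes a distinct count of host.

```python
from datetime import datetime, timedelta

def distinct_hosts_per_run(events, first_run, runs,
                           every=timedelta(minutes=5),
                           lookback=timedelta(minutes=20)):
    """Simulate a scheduled search: one distinct host count per run.

    events is a list of (timestamp, host) tuples. Each run counts the
    distinct hosts seen in [run_time - lookback, run_time).
    """
    results = []
    for i in range(runs):
        run_time = first_run + i * every
        window_start = run_time - lookback
        hosts = {host for ts, host in events if window_start <= ts < run_time}
        results.append((run_time, len(hosts)))
    return results

# Hypothetical sample data: three routers sending occasional tunnel traffic.
t0 = datetime(2024, 1, 1, 12, 0)
events = [
    (t0 + timedelta(minutes=1), "router-a"),
    (t0 + timedelta(minutes=7), "router-b"),
    (t0 + timedelta(minutes=7), "router-a"),
    (t0 + timedelta(minutes=18), "router-c"),
]

results = distinct_hosts_per_run(events, first_run=t0 + timedelta(minutes=20), runs=3)
for run_time, count in results:
    print(run_time.strftime("%H:%M"), count)
```

Each run produces a single number for its own 20 min window, and consecutive windows share 15 minutes of data. That is why a host that only sends traffic occasionally still gets counted in several consecutive KPI values.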
I hope this helps.

VexenCrabtree
Path Finder

Thanks a lot, that was a very clear description. By using earliest= and latest=, I didn't need to confuse my Ad Hoc search with the other elements I was trying to use.

skoelpin
SplunkTrust

I'm assuming you're querying itsi_summary and the data has already been processed by ITSI? What frequency is ITSI running at for this service? Are you trying to count the host values in the ITSI summary index? Since ITSI writes its data to a summary index, the host values will just show your ITSI search heads. You should use the splunk_server field if you want the hosts that were queried before the summary index was filled.