Splunk ITSI

Splunk Health Check Inquiry

lloydknight
Builder

Hello,

We ran a health check on our Splunk search head (with Splunk ITSI) and our indexers. The results were fairly straightforward, but I would still like your input, assessment, and, if possible, recommendations.

Current Situation:
Our Splunk ITSI is showing frequent N/A KPIs. My assessment is that these come from skipped searches: 1000+ KPIs have already been created (that's 1000+ saved searches every 5 minutes), and some forwarders are not following the interval set in inputs.conf (from add-on apps with scripted inputs enabled, especially on Windows).

Architecture:
3 Indexers
2 Search Heads (not clustered)

Search Head Statuses:
Search Head 1 is for miscellaneous reporting
- 5-10 concurrent users, some of whom view dashboards with expensive searches (some searches use summary data, but most run against raw data)

Search Head 2 is dedicated to Splunk ITSI
- 1200+ KPIs defined (scheduled searches), with jobs averaging 700 to 1000+
- Occasional warnings about the dispatch directory. As observed, when many users are active, searches queue up, which is to be expected.
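
As a rough illustration of why that many scheduled searches can end up skipped, here is a minimal sketch of the search concurrency arithmetic. It assumes the commonly documented limits.conf defaults (max_searches_per_cpu = 1, base_max_searches = 6, max_searches_perc = 50); the 16-core count is purely hypothetical, the rounding is simplified, and the real values on this search head may differ.

```python
def search_concurrency(cpu_count, max_searches_per_cpu=1,
                       base_max_searches=6, max_searches_perc=50):
    """Return (total historical search slots, slots usable by the scheduler).

    The defaults mirror the assumed limits.conf defaults; check your own config.
    """
    total = max_searches_per_cpu * cpu_count + base_max_searches
    scheduled = total * max_searches_perc // 100  # rounded down for simplicity
    return total, scheduled


def searches_per_minute(kpi_count, interval_minutes=5):
    """Average number of scheduled searches that start each minute."""
    return kpi_count / interval_minutes


def max_avg_runtime_seconds(scheduler_slots, per_minute):
    """Longest average runtime that still lets every search get a slot."""
    return scheduler_slots * 60 / per_minute


if __name__ == "__main__":
    total, scheduled = search_concurrency(cpu_count=16)  # hypothetical 16 cores
    rate = searches_per_minute(1200)                     # 1200 KPIs every 5 minutes
    print(f"total slots: {total}, scheduler slots: {scheduled}")
    print(f"searches per minute: {rate:.0f}")
    print(f"max average runtime: {max_avg_runtime_seconds(scheduled, rate):.2f}s")
```

Under those assumptions, each KPI search would need to finish in under three seconds on average to avoid a backlog, before counting any ad hoc searches from users.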

Health check results:

Search Head w/ ITSI
[Screenshot of the health check results; image not preserved]

Indexers
[Screenshot of the health check results; image not preserved]

Thoughts?

Much appreciated!


Richfez
SplunkTrust

Thank you for such a complete, well done question!

While there may be a variety of answers to this question, I'd start with THP. Transparent Huge Pages are a memory optimization intended for workloads other than the ones Splunk tends to impose on its servers. The most excellent docs outline the THP situation fairly well.

As to how to turn it off, you'll have to look at the documentation for whichever Linux distribution you are using. Each distribution is different, and the version matters too, because that's a moving target as well. We can probably give some help if you can't find the right docs (or if you follow them but it still shows THP turned on), but I'd give that a shot first.
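
To confirm whether THP is really off after you change it, something like the following sketch may help. It assumes the usual sysfs location, /sys/kernel/mm/transparent_hugepage (some older Red Hat kernels use a different directory), and relies on the kernel marking the active setting with square brackets. The function names are just for illustration.

```python
from pathlib import Path

# Usual sysfs location on most Linux kernels (an assumption; adjust if your
# distribution keeps these files somewhere else).
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_thp_setting(text):
    """Return the bracketed (active) value, e.g. 'never' from 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active setting found in {text!r}")


def report_thp(thp_dir=THP_DIR):
    """Read the 'enabled' and 'defrag' files, skipping any that do not exist."""
    result = {}
    for name in ("enabled", "defrag"):
        path = thp_dir / name
        if path.exists():
            result[name] = active_thp_setting(path.read_text())
    return result


if __name__ == "__main__":
    for name, value in report_thp().items():
        status = "off" if value == "never" else "still on"
        print(f"THP {name}: {value} ({status})")
```

Run it on each search head and indexer; anything other than "never" means the change did not take effect or did not survive a reboot.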

Second, if there's still a problem after turning off THP, we can look into those "resource limits set below recommendations". Or maybe we should do that anyway, but you'd have to dig into that search to find out WHICH resource is set below the recommendation and fix it.
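
If it helps to see which limit is the low one, here is a minimal sketch using Python's standard resource module. The target values (64000 open files, 16000 user processes) are assumptions based on commonly cited Splunk guidance, so check them against the docs for your version. Note that this reports the limits of the shell it runs in, so run it as the user that starts splunkd; a running splunkd process may still have different limits.

```python
import resource

# Assumed targets; verify against the Splunk documentation for your version.
RECOMMENDED = {
    "open files (ulimit -n)": (resource.RLIMIT_NOFILE, 64000),
    "user processes (ulimit -u)": (resource.RLIMIT_NPROC, 16000),
}


def below_recommendation(current, recommended):
    """True if a soft limit is lower than the target; 'unlimited' never is."""
    if current == resource.RLIM_INFINITY:
        return False
    return current < recommended


def check_limits():
    """Return (label, current soft limit, target) for each limit that is too low."""
    findings = []
    for label, (rlimit, target) in RECOMMENDED.items():
        soft, _hard = resource.getrlimit(rlimit)
        if below_recommendation(soft, target):
            findings.append((label, soft, target))
    return findings


if __name__ == "__main__":
    for label, soft, target in check_limits():
        print(f"{label}: {soft} is below the assumed target of {target}")
```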

Hope this helps!
-Rich
