Splunk Health Check Inquiry

lloydknight
Builder

Hello,

We ran a health check on our Splunk search head (with Splunk ITSI) and indexers. The results were fairly straightforward, but I would still like your input, assessment, and, if possible, recommendations.

Current Situation:
Our Splunk ITSI frequently shows N/A KPIs. My assessment is that this comes from two things: skipped searches, since 1000+ KPIs have already been created (that's 1000+ saved searches every 5 minutes), and some forwarders not following the interval set in inputs.conf (from add-on apps with scripted inputs enabled, especially on Windows).
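To put a rough number on the skipped-search theory, here is a back-of-the-envelope sketch. It assumes the stock limits.conf values as I understand them (base_max_searches = 6, max_searches_per_cpu = 1, max_searches_perc = 50) and a hypothetical 16-core search head; neither the core count nor the actual limits.conf settings are stated in this post, so substitute your own.

```python
def scheduler_slots(cpu_cores, max_searches_per_cpu=1,
                    base_max_searches=6, max_searches_perc=50):
    """Approximate number of concurrent searches the scheduler may run.

    The default arguments are assumed limits.conf defaults; check your own
    limits.conf before trusting the result.
    """
    total = max_searches_per_cpu * cpu_cores + base_max_searches
    return total * max_searches_perc // 100


def avg_runtime_budget(num_searches, interval_seconds, slots):
    """Average seconds each search can take before the schedule falls behind."""
    return interval_seconds * slots / num_searches


if __name__ == "__main__":
    slots = scheduler_slots(cpu_cores=16)          # 16 cores is hypothetical
    budget = avg_runtime_budget(1200, 300, slots)  # 1200 KPIs every 5 minutes
    print(f"scheduler slots: {slots}")
    print(f"average runtime budget per search: {budget:.2f} s")
```

Under those assumptions, if the KPI searches average longer than the printed budget, the scheduler cannot keep up and some runs will be skipped, which would be consistent with KPIs showing N/A.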

Architecture:
3 Indexers
2 Search Heads (not clustered)

Search Head Statuses:
Search Head 1 is for miscellaneous reporting
- 5-10 concurrent users, some of whom are viewing dashboards with expensive searches (some searches use summary data, but most run against raw data)

Search Head 2 is dedicated for Splunk ITSI
- 1200+ KPIs defined (scheduled searches), with jobs averaging 700 to 1000+
- occasional warnings on the dispatch directory and, as observed, searches queue when many users are active, which is to be expected.

Health Check results:

Search Head w/ ITSI
[screenshot of health check results]

Indexers
[screenshot of health check results]

Thoughts?

Much appreciated!

1 Solution

Richfez
SplunkTrust

Thank you for such a complete, well-done question!

While there may be a variety of answers to this question, I'd start with THP. Transparent Huge Pages are a memory optimization meant for workloads other than the ones Splunk tends to impose on its servers. The most excellent docs outline the THP situation fairly well.

As to how to turn it off, you'll have to look at the documentation for whichever Linux you are using. Each distribution is different, and in fact the version matters too because that's a moving target as well. We can probably give some help if you can't find the right docs (or if you follow them but it still shows THP turned on), but I'd give that a shot first.
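Whichever method your distribution uses, you can confirm the result by reading the kernel's THP files. Here is a minimal sketch that does so. It assumes the usual sysfs location (/sys/kernel/mm/transparent_hugepage) plus the redhat_transparent_hugepage variant found on some older Red Hat kernels; your system may differ. The helper names are just for illustration.

```python
from pathlib import Path

# Assumed sysfs locations; the second is used by some older Red Hat kernels.
THP_DIRS = [
    Path("/sys/kernel/mm/transparent_hugepage"),
    Path("/sys/kernel/mm/redhat_transparent_hugepage"),
]


def active_setting(text):
    """Return the bracketed value from a line like 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def thp_status():
    """Map each THP file that exists to its currently active setting."""
    status = {}
    for base in THP_DIRS:
        for name in ("enabled", "defrag"):
            path = base / name
            if path.is_file():
                status[str(path)] = active_setting(path.read_text())
    return status


if __name__ == "__main__":
    found = thp_status()
    if not found:
        print("No THP sysfs files found (not Linux, or THP not built in).")
    for path, value in found.items():
        flag = "OK" if value == "never" else "THP still active"
        print(f"{path}: {value} ({flag})")
```

Run it on the search head and on each indexer after the change and again after a reboot, since a setting applied only at runtime will not survive a restart.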

Second, if there's still a problem after turning off THP, we can look into those "resource limits set below recommendations". Or maybe we should do that anyway, but you'd have to dig into that search, find out WHICH resource is set below the recommendation, and fix it.
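If it helps to narrow down which limit is the culprit, here is a rough sketch using Python's standard resource module. The minimums in it (64000 open files, 16000 user processes) are what I recall from the Splunk docs and should be treated as assumptions to verify for your version. Note also that it reports the limits of the process running the script, so run it as the same user, started the same way, as splunkd.

```python
import resource

# Assumed minimums; verify against the Splunk documentation for your version.
RECOMMENDED = {
    "open files (ulimit -n)": (resource.RLIMIT_NOFILE, 64000),
    "user processes (ulimit -u)": (resource.RLIMIT_NPROC, 16000),
}


def is_below(soft, minimum):
    """True if a soft limit is under the minimum; unlimited always passes."""
    if soft == resource.RLIM_INFINITY:
        return False
    return soft < minimum


def check_limits():
    """Return {name: (soft_limit, minimum)} for each limit below its minimum."""
    low = {}
    for name, (which, minimum) in RECOMMENDED.items():
        soft, _hard = resource.getrlimit(which)
        if is_below(soft, minimum):
            low[name] = (soft, minimum)
    return low


if __name__ == "__main__":
    problems = check_limits()
    if not problems:
        print("All checked limits meet the assumed minimums.")
    for name, (soft, minimum) in problems.items():
        print(f"{name}: {soft} is below the assumed minimum of {minimum}")
```

For the limits of an splunkd process that is already running, /proc/<pid>/limits is the more direct source.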

Hope this helps!
-Rich

