Hello,
I am working in a distributed environment with:
- 1x SH with Splunk ES installed (Deployment Server)
- 7x Indexers (Search Peers)
On my SH, I see a lot of skipped executions of scheduled searches related to the Splunk CIM app.
Specifically, I see a 99% skip ratio for scheduled reports with names of the form:
_ACCELERATE_DM_Splunk_SA_CIM_Splunk_CIM_Validation.[Datamodel_Name]_ACCELERATE_
I opened the Data Models page and expanded the CIM Validation (S.o.S) data model. The information I got is:
"Access Count: 0 - Last Access: -", while the size is 750 MB and it is frequently updated.
My question:
Can I disable acceleration on this Data Model since it is never accessed?
Thank you in advance.
With kind regards,
Chris
Hi @IoannisG,
first of all, the ES Search Head must be a dedicated server, so you cannot also use it as the Deployment Server. In addition, a Deployment Server with more than 50 clients requires a dedicated server of its own.
Then, what are the resources of your SH and Indexers?
Remember that the minimum reference hardware is:
for an ES Search Head:
for Mid Tier Indexers:
Anyway, you can disable this acceleration, but the warning you are receiving is only a symptom: if you don't have sufficient resources, then after disabling this acceleration you'll probably get a similar message for another one.
So start by checking your resources; then, using the Monitoring Console, see whether some heavy scheduled search is causing problems for your system.
Finally, if the resources are adequate and there isn't any heavy scheduled search, open a case with Splunk Support.
Ciao.
Giuseppe
Ciao Giuseppe,
my Search Peers have fewer resources (CPU) than recommended, but I am already aware of that. Specifically:
Search Head: 32 vCPUs - 128GB RAM
Search Peers: 6 vCPUs - 40GB RAM
I have less than 10 clients/indexers in total.
Transparent Huge Pages and ulimits are optimized on all instances and health checks are green (except skipped searches).
Things I have manually changed/configured so far that might affect this:
Search Head & Search Peers - limits.conf
base_max_searches = 10
On both Search Head and Search Peers:
Relative concurrency limit for scheduled searches: 60
Relative concurrency limit for summarization searches: 100
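As a rough sketch of how these settings interact: per the limits.conf documentation, the historical search limit is max_searches_per_cpu x CPUs + base_max_searches, the scheduler gets a percentage of that, and summarization searches get a percentage of the scheduler's share. The function name, the value of 1 for max_searches_per_cpu, the mapping of the two UI percentages onto the scheduler and summarization shares, and rounding down are assumptions made for illustration; the real scheduler may round differently.

```python
def concurrency_limits(cpus, base_max_searches=10, max_searches_per_cpu=1,
                       scheduled_perc=60, summarization_perc=100):
    """Estimate concurrent search slots (hypothetical helper, not a Splunk API)."""
    # Total concurrent historical searches allowed on the instance.
    total = max_searches_per_cpu * cpus + base_max_searches
    # Share of the total that the scheduler may use (rounded down: assumption).
    scheduled = total * scheduled_perc // 100
    # Share of the scheduler's slots that summarization searches may use.
    summarization = scheduled * summarization_perc // 100
    return {"total": total, "scheduled": scheduled, "summarization": summarization}


for name, cpus in [("search head", 32), ("search peer", 6)]:
    print(name, concurrency_limits(cpus))
```

With a summarization share of 100%, data model acceleration searches can occupy every scheduler slot, which is one way other scheduled searches could end up skipped.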
I have disabled acceleration for Splunk CIM, and for about an hour now the aggregate search runtime has fallen dramatically (previously, the data model acceleration searches from Splunk CIM were dragging it out). I still see the red exclamation warning, though; I assume it is based on historical data, so maybe I should wait a bit, right?
Thanks again.
Chris
Hi @IoannisG,
the problem is probably the number of CPUs in your Indexers: remember that each search (and each subsearch) takes a CPU and releases it only when it finishes.
Also, how much data do you index every day?
With ES, each indexer should handle at most 80-100 GB/day, so if you index 1 TB of logs daily, you need at least 10 Indexers with the reference hardware I described.
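A minimal sketch of that rule of thumb, assuming 1 TB is counted as 1000 GB and that the per-indexer capacity is somewhere in the 80-100 GB/day range; the function name and parameters are hypothetical, chosen only for illustration.

```python
import math


def indexers_needed(daily_gb, gb_per_indexer_per_day=100):
    """Smallest number of indexers that keeps each one at or below the daily cap."""
    # Always keep at least one indexer, even for very small volumes.
    return max(1, math.ceil(daily_gb / gb_per_indexer_per_day))


# 1 TB/day at the optimistic end of the range (100 GB/day per indexer).
print(indexers_needed(1000))
# The same volume at the conservative end (80 GB/day per indexer).
print(indexers_needed(1000, 80))
```

The optimistic end of the range reproduces the figure of 10 indexers; the conservative end asks for a few more.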
Could you add more CPUs?
You should also see warnings from ES and the Health Check about the limited number of CPUs.
You mentioned THP; please check that it is disabled.
Anyway, I'm pretty sure the problem is the CPUs, and if you open a case with Splunk Support, they will surely give the same answer.
Ciao.
Giuseppe
Hello again Giuseppe,
I index an average of 8-10 GB per day, evenly distributed across my clients/indexers.
THP is disabled and confirmed via Health Check.
I have checked my Scheduler Activity: Instance dashboards and things seem to be much better. Splunk_CIM with acceleration disabled has decluttered the dashboards and I still don't see any other data model having a skipped search. Since I disabled it, I have zero skipped scheduled searches.
I will wait a bit more and see how it goes.
Many thanks,
Chris
PS: CPU upgrade is being planned soon 🙂 But it's better to search for a possible misconfiguration first rather than adding CPUs and hiding the underlying problem.
Hi @IoannisG,
using the reference hardware is worthwhile even if your system is underused, because it's the first thing that gets checked when you open a case with Splunk Support.
Let me know if you solve it.
Ciao and happy splunking.
Giuseppe
Buon giorno Giuseppe,
I have managed to make my Splunk status green by doing the following:
1. Fixing the default tags on Splunk CIM
| rest splunk_server=remote* servicesNS/-/-/saved/eventtypes
| search tags=*
| table eai:acl.app, eai:acl.sharing, eai:acl.perms.read, title, search, tags, author
This search helped me identify which tags I should whitelist in each data model. The indexes were already set in the macros, but the tags seemed to be completely wrong.
2. Fixing my ulimits
The fsize line was missing; adding it is what fixed my open files warning.
* hard nofile 64000
* hard nproc 16000
* hard fsize -1
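One way to confirm that such limits are actually in effect is to read them from a running process. The sketch below uses Python's standard resource module (Unix only) and compares the hard limits of the current process against the values listed above. It assumes the script is run as the same user that runs Splunk; the helper names are hypothetical.

```python
import resource

# Required hard limits, taken from the limits lines above (-1 means unlimited).
REQUIRED_HARD = {
    "nofile": 64000,
    "nproc": 16000,
    "fsize": resource.RLIM_INFINITY,
}

# Map the limits.conf item names to the corresponding resource constants.
RLIMIT_IDS = {
    "nofile": resource.RLIMIT_NOFILE,
    "nproc": resource.RLIMIT_NPROC,
    "fsize": resource.RLIMIT_FSIZE,
}


def meets(current, required):
    """Return True if a hard limit is at least the required value."""
    if current == resource.RLIM_INFINITY:
        return True  # unlimited satisfies any requirement
    if required == resource.RLIM_INFINITY:
        return False  # a finite limit cannot satisfy "unlimited"
    return current >= required


def check_hard_limits():
    """Return {name: (hard_limit, ok)} for the current process."""
    results = {}
    for name, rlimit_id in RLIMIT_IDS.items():
        _soft, hard = resource.getrlimit(rlimit_id)
        results[name] = (hard, meets(hard, REQUIRED_HARD[name]))
    return results


if __name__ == "__main__":
    for name, (hard, ok) in check_hard_limits().items():
        print(f"{name}: hard={hard} {'OK' if ok else 'TOO LOW'}")
```

Limits set in limits.conf apply to new login sessions, so a service started another way (for example by systemd) may run with different values than the ones this script reports.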
I have checked all my scheduled searches one by one and they were optimized (search window: auto, no real-time searches).
3. Minimized the summary indexing according to needs
Some data models were set to build summaries over a period that I did not need (e.g. 1 year), so changing this to a smaller range might have helped too.
Hardware resource consumption was, and still seems to be, low, but an upgrade still has to be performed.
Thanks a lot for your support.
With kind regards,
Chris
Hi @IoannisG,
good for you! Let me know if you need more help; otherwise, please accept the answer for the benefit of other people in the Community.
Ciao and happy splunking.
Giuseppe