Why doesn't summary index data get written sometimes?

Sharzi
Explorer

Hi,

I've had an issue with summary indexing since last week. I have around 25 saved searches that run at 15 minutes past the hour and save their results to the summary index. Based on the jobs, these searches run fine with no errors, but sometimes summary index data is not written for some of them.

I checked the _internal logs and found the following:

reason=The maximum number of concurrent historical scheduled searches on this instance has been reached, Status=Continued

The concurrency limit on the search head is 39, and I changed max_searches_per_cpu to 2 on both the search head and the indexers, but there was no improvement.

My issue is similar to this post, but the solution is for version 5.

Could someone please help me with this?
Thank you!

efavreau
Motivator

@Sharzi There's no silver bullet for these types of things. It's a lot of trial and error. @VatsalJagani offered some good tips. We also found that it sometimes helped to use the AUTO schedule window when scheduling (with the cron), to have some queries send to the summary index once a week (instead of twice a day), and to look for conflicting schedules, not just among the jobs in our app but across all jobs in all apps.
Even with band-aids like these and after support tickets, things became worse. We diagnosed the hardware and found contention on some KPIs. We upgraded the hardware and the issues went away.
Good luck!

Sharzi
Explorer

Thank you @efavreau!

That might be the case, since we recently downsized several of our nodes from c5.9xlarge to c5.4xlarge (due to very low CPU usage and an "over-provisioned" flag), and there is also an IOWait warning. So I'm wondering what you upgraded to solve the issue.

VatsalJagani
SplunkTrust

@Sharzi - This is usually caused by too many scheduled reports/alerts executing at the same time. Try adjusting the cron schedules so that the alerts/reports execute at different minutes of the hour.

For example, have a few alerts execute at the 1st minute of the hour, more at the 3rd minute, and so on, rather than running all reports at the 0th minute of the hour.
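
As a rough illustration of the staggering idea, here is a minimal Python sketch that gives each saved search its own minute of the hour. The search names, the starting minute (15, taken from the original question), and the 2-minute step are all assumptions for illustration; the output lines only show what a cron_schedule value per search could look like, not a definitive configuration.

```python
def stagger_cron_minutes(search_names, start_minute=15, step=2):
    """Map each search name to an hourly cron expression with its own minute."""
    # With n searches spaced `step` minutes apart, minutes stay distinct
    # only while n * step does not exceed 60.
    if len(search_names) * step > 60:
        raise ValueError("too many searches for this step; minutes would collide")
    schedules = {}
    for i, name in enumerate(search_names):
        minute = (start_minute + i * step) % 60  # wrap past the top of the hour
        schedules[name] = f"{minute} * * * *"
    return schedules


# Hypothetical names standing in for the ~25 summary searches in the question.
names = [f"summary_search_{i:02d}" for i in range(1, 26)]
schedules = stagger_cron_minutes(names)
for name, cron in schedules.items():
    print(f"{name}: cron_schedule = {cron}")
```

Staggering only changes when each search starts. If the searches use a snapped time range (for example earliest=-1h@h latest=@h), each run should still summarize the same hour regardless of the minute it fires.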

 

Solution in case there are still skipped report executions:

I hope this helps, kindly upvote if it does!!!

Sharzi
Explorer

Hi @VatsalJagani ,

Thank you for your reply.

We spread out the schedule, but we still have the same issue. Since the searches run with no errors, do you think the backfill script would solve this issue?

Thanks,

VatsalJagani
SplunkTrust

This is a resource issue, so the backfill script is not a solution; rather, it is a patch for the problem created by the resource limitation.

So you definitely have to fix the resource issue.
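
To make the resource angle a bit more concrete, here is a minimal Python sketch of how the CPU count feeds the search concurrency ceiling. It assumes the formula described in limits.conf (max_searches_per_cpu x number_of_cpus + base_max_searches, with the scheduler allowed max_searches_perc percent of that total) and the default values base_max_searches=6 and max_searches_perc=50. The rounding is an assumption of this sketch, and the results will not necessarily match the limit of 39 reported earlier in the thread, which depends on settings not shown here.

```python
def concurrency_limits(num_cpus, max_searches_per_cpu=1,
                       base_max_searches=6, max_searches_perc=50):
    """Return (max_hist_searches, scheduler_limit) for one instance."""
    # [search] stanza: total concurrent historical searches.
    max_hist_searches = max_searches_per_cpu * num_cpus + base_max_searches
    # [scheduler] stanza: share of that total available to scheduled searches.
    # Rounding down is an assumption of this sketch.
    scheduler_limit = max_hist_searches * max_searches_perc // 100
    return max_hist_searches, scheduler_limit


# vCPU counts for the two instance types mentioned in this thread.
for label, cpus, per_cpu in [
    ("c5.9xlarge, max_searches_per_cpu=1", 36, 1),
    ("c5.4xlarge, max_searches_per_cpu=1", 16, 1),
    ("c5.4xlarge, max_searches_per_cpu=2", 16, 2),
]:
    total, scheduled = concurrency_limits(cpus, per_cpu)
    print(f"{label}: total={total}, scheduled={scheduled}")
```

Raising max_searches_per_cpu lifts the ceiling on paper, but it does not add CPU or disk capacity, so on a node that already shows IOWait warnings it may just move the contention elsewhere.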

 

I hope this helps!!!
