Why doesn't summary index data get written sometimes?

Sharzi
Explorer

Hi,

I've had an issue with summary indexing since last week. I have around 25 saved searches that run at 15 minutes past the hour and save their results to a summary index. According to the job history, these searches run fine with no errors, but sometimes summary index data is not written for some of them.

I checked the _internal logs and found the following:

reason=The maximum number of concurrent historical scheduled searches on this instance has been reached, Status=Continued

The concurrency limit on the search head is 39. I changed max_searches_per_cpu to 2 on both the SH and the indexers, but there was no improvement.
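
For context, the limit named in that log message is derived from a few limits.conf settings. The following is only a rough Python sketch of that arithmetic, not Splunk code. It assumes the commonly documented defaults (base_max_searches = 6, max_searches_perc = 50) and a hypothetical 16-core search head; the rounding is also an assumption, so check the limits.conf spec for your version.

```python
def search_concurrency(cpu_count, max_searches_per_cpu=1,
                       base_max_searches=6, max_searches_perc=50):
    """Estimate search concurrency limits from limits.conf-style settings.

    Defaults and integer rounding are assumptions for illustration only.
    """
    # Total concurrent historical searches allowed on the instance.
    max_hist_searches = max_searches_per_cpu * cpu_count + base_max_searches
    # Share of that total the scheduler may use for scheduled searches.
    max_scheduled = max_hist_searches * max_searches_perc // 100
    return max_hist_searches, max_scheduled


# Hypothetical 16-core search head, before and after raising max_searches_per_cpu.
print(search_concurrency(16, max_searches_per_cpu=1))  # (22, 11)
print(search_concurrency(16, max_searches_per_cpu=2))  # (38, 19)
```

Raising max_searches_per_cpu only raises the cap. It does not add CPU or I/O capacity, so scheduled searches can still be delayed or skipped on a saturated host.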

My issue is similar to this post, but the solution is for version 5.

Could someone please help me with this?
Thank you!


efavreau
Motivator

@Sharzi There's no silver bullet for these types of things. It's a lot of trial and error. @VatsalJagani offered some good tips. We also found that it sometimes helped to use the AUTO schedule window when scheduling (with the cron), to have some queries write to the summary index once a week (instead of twice a day), and to look for conflicting schedules across all jobs in all apps, not just the jobs in our own app.
After support tickets, and even with band-aids like these, things became worse. We then diagnosed the hardware and found contention on some KPIs. We upgraded the hardware and the issues went away.
Good luck!

###

If this reply helps you, an upvote would be appreciated.

Sharzi
Explorer

Thank you @efavreau!

That might be the case here, since we recently downsized several of our nodes from c5.9xlarge to c5.4xlarge (due to very low CPU usage and an "over-provisioned" flag), and there is also an IOWait warning. So I'm wondering what you upgraded to solve the issue.


VatsalJagani
SplunkTrust

@Sharzi - This is usually caused by too many scheduled reports/alerts executing at the same time. Try adjusting the cron schedules so that alerts/reports execute at different minutes of the hour.

For example, schedule a few alerts at the 1st minute of the hour, a few more at the 3rd minute, and so on, rather than running all reports at the 0th minute of the hour.
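
As an illustration of the staggering idea, here is a minimal Python sketch that assigns each saved search its own minute of the hour and emits a cron expression for it. The search names, starting minute, and step are hypothetical; the resulting strings would still have to be applied to each saved search's cron schedule by hand or by your own tooling.

```python
def staggered_crons(names, start_minute=15, step=2):
    """Map each saved-search name to an hourly cron expression.

    Minutes are spaced `step` apart, starting at `start_minute`,
    and wrap around at 60 so every value stays a valid cron minute.
    """
    schedules = {}
    for i, name in enumerate(names):
        minute = (start_minute + i * step) % 60
        schedules[name] = f"{minute} * * * *"
    return schedules


# Hypothetical names standing in for the ~25 summary searches.
searches = [f"summary_search_{n:02d}" for n in range(1, 26)]
for name, cron in staggered_crons(searches).items():
    print(name, "->", cron)
```

If the schedules are shifted this way, it is probably worth keeping each search's time range snapped to the hour so the summary data covers the same window regardless of the minute it runs.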

Solution in case there are still skipped report executions:

I hope this helps, kindly upvote if it does!!!

Sharzi
Explorer

Hi @VatsalJagani ,

Thank you for your reply.

We spread out the schedule, but we still have the same issue. Since the searches are running with no errors, do you think the backfill script would solve this issue?

Thanks,


VatsalJagani
SplunkTrust

This is a resource issue, so the backfill script is not a solution; rather, it is a patch for the problem created by the resource limitation.

So, you definitely have to fix the resource issue.

I hope this helps!!!
