Why doesn't summary index data get written sometimes?

Sharzi · ‎04-17-2023

Hi,

I've been facing an issue with summary indexing since last week. I have around 25 saved searches that run at 15 minutes past the hour and save their results to the summary index. According to the job inspector, these searches run fine with no errors, but sometimes summary index data is not written for some of them.

I checked the _internal log and found the following:

reason=The maximum number of concurrent historical scheduled searches on this instance has been reached, Status=Continued

The concurrency limit on the search head is 39, and I changed max_searches_per_cpu to 2 on both SH and indexers, but no improvement!
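For context, here is a rough sketch of how this kind of limit is usually derived. It assumes the formula given in the limits.conf documentation (max_searches_per_cpu x number of CPUs + base_max_searches, with the scheduler allowed a max_searches_perc share of that). The function names are hypothetical, the input values are illustrative rather than taken from this environment, and rounding down is an assumption.

```python
def max_historical_searches(cpu_count, max_searches_per_cpu=1, base_max_searches=6):
    """System-wide cap on concurrent historical searches (limits.conf formula)."""
    return max_searches_per_cpu * cpu_count + base_max_searches


def max_scheduled_searches(historical_limit, max_searches_perc=50):
    """Share of the cap available to the scheduler; rounding down is an assumption."""
    return historical_limit * max_searches_perc // 100


# Illustrative values only: a 16-vCPU host with max_searches_per_cpu = 2.
limit = max_historical_searches(cpu_count=16, max_searches_per_cpu=2)
scheduled = max_scheduled_searches(limit)
print(limit, scheduled)  # prints: 38 19
```

If the numbers work out this way, raising max_searches_per_cpu lifts the ceiling on paper but does not add CPU or disk capacity, so searches can still queue up when the host itself is saturated.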

My issue is similar to this post, but the solution is for version 5.

Could someone please help me with this?
Thank you!

efavreau · ‎04-19-2023

@Sharzi There's no silver bullet for these types of things. It's a lot of trial and error. @VatsalJagani offered some good tips. We also found it sometimes helped to use the AUTO schedule window when scheduling (with the cron), to have some queries send to the summary index once a week (instead of twice a day), and to check for conflicting schedules, not just among the jobs in our app, but across all jobs in all apps.
Even after support tickets and band-aids like these, things became worse. We diagnosed the hardware and found contention on some KPIs. We upgraded the hardware and the issues went away.
Good luck!

###

If this reply helps you, an upvote would be appreciated.

Sharzi · ‎04-19-2023

Thank you @efavreau!

That might be the case here, since we recently downsized several of our nodes from c5.9xlarge to c5.4xlarge (due to very low CPU usage and an "over-provisioned" flag), and there is also an IOWait warning. What did you upgrade to solve the issue?

VatsalJagani · ‎04-18-2023

@Sharzi - This is usually caused by too many scheduled reports/alerts executing at the same time. Try adjusting the cron schedules so that the alerts/reports execute at different minutes of the hour.

For example, have a few alerts execute at the 1st minute of the hour, a few more at the 3rd minute, and so on, rather than running all reports at the 0th minute of the hour.
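A minimal sketch of that staggering idea, assuming you just want evenly spaced minute offsets to paste into each saved search's cron schedule. The function name, the 30-minute window, and the starting minute are hypothetical choices, not Splunk settings.

```python
def stagger_cron(n_searches, start_minute=15, window=30):
    """Return one hourly cron expression per search, spread over `window` minutes."""
    crons = []
    for i in range(n_searches):
        # Evenly spaced offset inside the window, wrapped to stay within 0-59.
        minute = (start_minute + (i * window) // n_searches) % 60
        crons.append(f"{minute} * * * *")
    return crons


# 25 searches spread between minute 15 and minute 43 of each hour.
for cron in stagger_cron(25):
    print(cron)
```

With 25 searches and a 30-minute window, every search gets its own minute, so no two of them start at the same time.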

Solution in case report executions are still being skipped:

In that case you can use the backfill mechanism described here - https://docs.splunk.com/Documentation/Splunk/9.0.4/Knowledge/Managesummaryindexgapsandoverlaps
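Before backfilling, it helps to know which hourly runs are actually missing. Here is a small sketch, assuming you have already exported the hourly timestamps that do exist in the summary index for one saved search. The function name and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def find_missing_runs(present, start, end, step=timedelta(hours=1)):
    """List the expected run times between start and end that have no summary data."""
    seen = set(present)
    missing = []
    current = start
    while current <= end:
        if current not in seen:
            missing.append(current)
        current += step
    return missing


# Sample data: the 02:15 and 04:15 runs never wrote to the summary index.
present = [datetime(2023, 4, 17, h, 15) for h in (0, 1, 3, 5)]
gaps = find_missing_runs(present, datetime(2023, 4, 17, 0, 15), datetime(2023, 4, 17, 5, 15))
for gap in gaps:
    print(gap.isoformat())
```

The resulting time ranges are what you would then hand to the backfill script from the linked documentation.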

I hope this helps, kindly upvote if it does!!!

Sharzi · ‎04-19-2023

Hi @VatsalJagani ,

Thank you for your reply.

We spread out the schedule, but we still have the same issue. Since the searches are running with no errors, do you think the backfill script would solve this issue?

Thanks,

VatsalJagani · ‎04-19-2023

This is a resource issue, so the backfill script is not a solution, but rather a patch for the gaps created by the resource limitation.

So you definitely have to fix the resource issue.

I hope this helps!!!
