Hi,
One of my customers received a "waiting for queued job to start" message today, and it then took about 5 minutes for the job to run. How can I troubleshoot this (since I have a boatload of people about ready to start using Splunk)?
We ran into the same issue in our environment. The number of concurrent searches that can be executed is controlled by max_searches_per_cpu, which by default is set to 1. base_max_searches is then added to that number to define the maximum number of searches that can execute at the same time:
max # searches = (max_searches_per_cpu * # CPUs) + base_max_searches
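For example, on a 64-core search head with the defaults (max_searches_per_cpu = 1 and, if I remember the default correctly, base_max_searches = 6), that works out to (1 * 64) + 6 = 70 concurrent historical searches.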
With SHP, depending on how many users are logged on to a given server behind the load-balanced VIP and which dashboards they are launching, you might start seeing jobs queued. The other major factor is the scheduled reports/searches/alerts you have in the system; these add to the queuing.
Most of the time, queuing is seen at 15, 30, 45 and 00 minutes past the hour (like a wave), as users tend to run scheduled jobs every 5/10/15 minutes. The hardest hit is at the top of the hour, when most of the searches run at the same time.
I would advise starting with max_searches_per_cpu set to 2 in the local limits.conf on the servers and going up to 4 if needed; a sketch of the stanza is below. If you still see the issue at a value of 4, plan to add another server with the same number of CPUs to your SHP.
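For reference, here is roughly what that would look like in limits.conf on each search head (just a sketch; check the limits.conf spec for your version before applying):

[search]
# Allow 2 concurrent historical searches per CPU core (default is 1).
max_searches_per_cpu = 2
# Base number of searches added on top of the per-CPU allowance
# (left at what I believe is the default, shown only for context).
base_max_searches = 6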
See: http://docs.splunk.com/Documentation/Splunk/6.1.3/admin/Limitsconf
Search for max_searches_per_cpu in Splunk Answers for more insight.
I faced the same issue today and tried various things, but nothing worked. When I increased the user-level concurrent search jobs limit and the total jobs disk quota on the role (through the Access Controls option), the dashboard started working fine again.
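For what it's worth, if you'd rather manage that in config than in the UI, I believe the same role settings live in authorize.conf, along these lines (the role name and values here are just examples, so treat this as a sketch):

[role_power]
# Maximum concurrent search jobs for users with this role (example value).
srchJobsQuota = 10
# Total disk space, in MB, that this role's search jobs may use (example value).
srchDiskQuota = 500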
This error also occurs if your user has gone over the disk space quota for search jobs. If that's the case, the error can be seen in the Job Inspector. Delete old search jobs under Activity -> Jobs to clear this problem.
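If you want to see which users are holding the most job disk space without clicking through every job, something along these lines should work from the search bar (just a sketch using the search jobs REST endpoint; field names may differ slightly by version):

| rest /services/search/jobs
| stats sum(diskUsage) as disk_used_bytes, count as job_count by eai:acl.owner
| sort - disk_used_bytes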
This worked too. Thanks
Thanks, this worked great :)
Good stuff! Thanks.
Use the SOS (Splunk on Splunk) app to see which jobs are taking the most time. The message suggests all of your cores are already taken and jobs are waiting for a free core to start. Check the jobs under the Jobs option and the System Activity dashboards about users (there's also a quick search sketched below).
And in a distributed environment, a lot also depends on how the indexers handle the search head's requests, so you might as well look into your indexer usage too. Thanks
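To see what is actually running or stuck in the queue at a given moment, a quick alternative to SOS is to query the jobs endpoint directly; something like the following should be close, though the exact field names may vary by version:

| rest /services/search/jobs
| search dispatchState="RUNNING" OR dispatchState="QUEUED"
| table eai:acl.owner, title, dispatchState, runDuration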
64 cores per server, using SHP.
The jobs won't show as skipped, because that's solely for scheduled jobs.
How many CPUs are on the search head? The maximum number of concurrent historical searches is based on the number of CPUs on the search head.
Anyone? I've looked for this message and skipped jobs, but haven't been able to find anything.