Splunk Search

Average execution lag increases over time

drussell88
Explorer

I am having an issue with the average execution lag increasing over a period of 24 hours. This pushes back the time at which my scheduled jobs actually run. I have around 450 saved searches that run anywhere from every 7 minutes to every 4 hours. I have to restart the search head to alleviate the issue. The search head is a Windows server with 4 Intel Xeon E7 processors. Each E7 has 8 cores, giving the machine 32 CPUs. I do not see any issues with CPU usage or memory; I have plenty of both. I was reading about modifying max_searches_per_cpu (mine is currently 4) and/or changing max_searches_perc to a value greater than 25 in limits.conf. I am not sure exactly what I should do in my case. My limits.conf is the one in default; I do not currently have a version in local. I also read that some jobs might be waiting on others to finish. How do I determine if this is happening?


yannK
Splunk Employee

I have around 450 saved searches that run anywhere from every 7 minutes to every 4 hours

This is likely the cause. If your searches overlap, you will quickly reach the maximum number of concurrent searches, both the per-user quota and the system-wide limit.
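As a rough illustration of where the system limit comes from, here is a minimal Python sketch of the arithmetic described in the limits.conf spec: the system-wide limit on concurrent historical searches is max_searches_per_cpu x number_of_cpus + base_max_searches, and the scheduler may use max_searches_perc percent of that. The 32 CPUs, max_searches_per_cpu = 4, and max_searches_perc = 25 come from the question. base_max_searches = 6 is an assumed value (check your own default/limits.conf), and rounding down is also an assumption.

```python
def max_hist_searches(cpus, max_searches_per_cpu, base_max_searches):
    """System-wide limit on concurrent historical searches."""
    return max_searches_per_cpu * cpus + base_max_searches


def scheduler_limit(cpus, max_searches_per_cpu, base_max_searches, max_searches_perc):
    """Share of the system-wide limit that the scheduler may use.

    Rounding down is an assumption made for this sketch.
    """
    total = max_hist_searches(cpus, max_searches_per_cpu, base_max_searches)
    return total * max_searches_perc // 100


# Values from the question, except base_max_searches (assumed to be 6).
cpus = 32
per_cpu = 4
base = 6
perc = 25

print("system-wide limit:", max_hist_searches(cpus, per_cpu, base))
print("scheduler limit:", scheduler_limit(cpus, per_cpu, base, perc))
```

With these inputs the sketch gives 134 concurrent historical searches system-wide and 33 for the scheduler.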

Install the SOS app and look at the scheduler dashboard. You will see when scheduled searches start to be skipped, and you can find the worst searches.

To resolve it, optimize the duration of your searches and spread their schedules out over time.
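To illustrate why the spread matters, here is a small Python sketch (not Splunk code) that counts how many searches are scheduled to start in the same minute. The schedules are hypothetical: 12 searches that each run every 15 minutes, first all on the same minute and then staggered by one minute each.

```python
from collections import Counter


def starts_per_minute(schedules, horizon_minutes):
    """Count how many searches start at each minute.

    schedules: list of (period_minutes, offset_minutes) pairs.
    """
    counts = Counter()
    for period, offset in schedules:
        for minute in range(offset, horizon_minutes, period):
            counts[minute] += 1
    return counts


def peak_starts(schedules, horizon_minutes):
    """Largest number of searches that start in any single minute."""
    counts = starts_per_minute(schedules, horizon_minutes)
    return max(counts.values()) if counts else 0


# Hypothetical schedules: 12 searches, each running every 15 minutes.
aligned = [(15, 0) for _ in range(12)]        # all start on the same minute
staggered = [(15, i % 15) for i in range(12)]  # each shifted by one minute

print("aligned peak:", peak_starts(aligned, 60))
print("staggered peak:", peak_starts(staggered, 60))
```

In this toy case the aligned schedules start 12 searches in the same minute, while the staggered ones never start more than 1 at a time. The same idea applies to the minute field of each saved search's cron schedule.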


drussell88
Explorer

I have one search head and one indexer available to me. The machines seem able to handle the load, so I was thinking it is a configuration issue. I have been looking into DEBUG settings and virus scan exclusions.


yannK
Splunk Employee

If the core issue is slow searches, you need to consider scaling your cluster. See whether load balancing your data over more indexers improves overall search speed.


drussell88
Explorer

There is a period of time where there is a large number of skipped searches, but it seems to be all of them.


drussell88
Explorer

I do have the SOS app. The average number of running searches is below 5 over a 24 hour period, but the lag time creeps way up. There are periods of time where there is a large number of searches, but it seems like it is all of them.
