Splunk Enterprise Security

Splunk Searches delayed

jadengoho
Builder

Hi All,
I would like to ask why we are seeing this notification:
Root Cause(s):

  • The percentage of high priority searches delayed (16%) over the last 24 hours is very high and exceeded the red thresholds (10%) on this Splunk instance. Total Searches that were part of this percentage=12. Total delayed Searches=2
  • The percentage of non high priority searches delayed (47%) over the last 24 hours is very high and exceeded the red thresholds (20%) on this Splunk instance. Total Searches that were part of this percentage=21. Total delayed Searches=10

May I know what the possible causes and resolutions are?

aberkow
Builder

I answered a similar question more generally here - https://answers.splunk.com/answers/786499/the-percentage-of-non-high-priority-searches-lagge.html#an.... The gist is that you can use the Monitoring Console (and its built-in queries) to better diagnose what your specific issues are.

Here's the path (assuming you're a Splunk admin on your instance): Settings (top right) -> Monitoring Console -> Search -> Scheduler Activity: Instance, then enter the timeframe when this occurred. Hopefully the information under "historical charts" can point you toward the cause (perhaps the machine blipped, or you have a misconfigured search, etc.), or at least narrow down the timeframe/options so you can continue debugging.

Hope this helps!

jvarner
Observer

Also, make sure to check your firewall settings.

jadengoho
Builder

Hi, thanks for this. I managed to identify the issue.

Resolution:
Increase the values in limits.conf based on the server information and Splunk transactions.

sharmajiankur
Engager

I found the below on Reddit, which fixed my issue:

https://www.reddit.com/r/Splunk/comments/kx2bo6/splunk_searches_skipped_error/gjaak2e?utm_source=sha...

I'm a little late in responding here, but I just went through this scenario myself. I found that any time we made a change, it would cause searches to be delayed until they caught up. Before resorting to raising the limits in limits.conf, you need to identify what change caused Splunk to get overwhelmed. Since things are not working for you, this may be a little difficult. I would recommend contacting support if you're not comfortable with this.

Check correlation searches and ensure they are not set to run as real-time searches; these would consume resources nonstop.

Check data models and disable acceleration on any you are not using. Verify the time range on the data models you leave accelerated. I found some apps had acceleration enabled with a long backfill range, which takes up a large amount of resources until it catches up.

I did end up working with support and received some really good clarification on the CPU settings in limits.conf.

Let me first clarify the confusion:

A single CPU can have 16 cores. Splunk suggests 1 search per CPU core, not per CPU. The defaults are set extremely low. I'm sure that's something that is supposed to get updated during Splunk onboarding with professional services.

You will have to look at which CPUs you have and find out how many cores each one has.

base_max_searches = leave at the default of 6 (let max_searches_per_cpu do the work)

max_searches_per_cpu = if 1 CPU has 16 cores, make sure to leave some room for overhead processes, so 12 would be a sweet spot.

max_hist_searches = max_searches_per_cpu x number_of_cpus + base_max_searches (example: 12 x 16 + 6 = 198). Do not modify this.
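
As a rough illustration of the formula above (not Splunk code, just the arithmetic from this post), here is a minimal Python sketch. The function name and the comparison value max_searches_per_cpu = 1 are assumptions for illustration; check limits.conf.spec for the actual defaults in your version.

```python
def max_hist_searches(max_searches_per_cpu: int,
                      number_of_cpus: int,
                      base_max_searches: int = 6) -> int:
    """Apply the formula quoted in the post.

    base_max_searches defaults to 6, the default named in the post.
    number_of_cpus is treated as the core count, as in the post's example.
    """
    return max_searches_per_cpu * number_of_cpus + base_max_searches


if __name__ == "__main__":
    # The post's example: 12 searches per core on a 16-core machine.
    print(max_hist_searches(12, 16))  # 198
    # Assumed low setting of 1 per core on the same machine, for comparison.
    print(max_hist_searches(1, 16))   # 22
```

On the same 16-core machine, moving from 1 to 12 searches per core changes the historical search ceiling from 22 to 198, which is why it is worth finding the resource hog first, as the post advises.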

After making these changes I have not had this occur again. I've monitored CPU utilization and it has remained stable.

Again, only make the limits.conf changes once you have figured out what is taking up so many resources; otherwise the change may just cause the server to constantly use a lot of resources. This must be configured on the search head and indexers.

Once you get this working, I do recommend using SplunkAdmins to help identify further issues. There could be a large number of underlying issues you may not even be aware of.

rmanrique
Path Finder

Which values did you change in limits.conf?

btshivanand
Path Finder

Can you please tell me which attribute to change in limits.conf? We have the same issue.

splunkcol
Builder

It would be great if someone could help the many users who do not know where these files are located or which parameters to configure in them.

In this case, where should the limits.conf configuration be applied and what are the parameters?

btshivanand
Path Finder

We have the same issue too. Can you please tell us which attribute values should be increased?

deepamshah
Explorer

Hi jadengoho.
Can you please explain what configuration was added/extended in limits.conf to resolve this?

Thanks

orezaie
Explorer

In /opt/splunk/etc/system/local/limits.conf:

[search]
max_searches_per_cpu = 12

Then restart Splunk.
