How do I fix this Search lag error?

tpchi · ‎04-08-2020

Hi team,
I am getting the following error in my Splunk health check:
"The number of extremely lagged searches (1) over the last hour exceeded the red threshold (1) on this Splunk instance"
Do you have any idea what I should do?

DalJeanis · ‎04-10-2020

Okay, so what it is telling you is that one or more of your searches lagged badly in the last hour.

You need to figure out WHAT those searches are, and WHY they are slow.

You can start by trying to figure out which jobs are taking up lots of time:

|rest /services/search/jobs | sort 0 - performance_command_addinfo_duration_secs

Then you can start looking at the biggest time wasters, and seeing what might be making them slow. There are dozens of things we could look at, from the very simple to the very complex.
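Since, as far as I know, this health check is driven by the search scheduler, it can also help to look at the scheduler log directly. This is a rough sketch, assuming you can search the _internal index and that your scheduler events carry the usual scheduled_time, dispatch_time, run_time and savedsearch_name fields (check that on your version before trusting the numbers). It computes how late each scheduled search was dispatched and lists the worst offenders first:

```
index=_internal sourcetype=scheduler status=success
| eval lag_secs = dispatch_time - scheduled_time
| stats count max(lag_secs) AS max_lag_secs avg(run_time) AS avg_run_time_secs BY app savedsearch_name
| sort 0 - max_lag_secs
```

Run it over the last hour or so to match the window the health check uses. Searches with a large max_lag_secs are the ones to investigate first.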

First, get rid of all realtime searches. They are almost never really needed. Use near-real-time searches that run every minute or two instead, or use data models, or any of a number of other strategies that save CPU cycles.

https://answers.splunk.com/answers/734767/why-are-realtime-searches-disliked-in-the-splunk-w.html
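To find the realtime searches that are saved on the instance, something like the following may work. It is only a sketch: it assumes your role can read the saved/searches REST endpoint across all apps and owners, and it relies on realtime searches having a dispatch time range that starts with "rt".

```
| rest /servicesNS/-/-/saved/searches splunk_server=local
| search dispatch.earliest_time="rt*" OR dispatch.latest_time="rt*"
| table title eai:acl.app eai:acl.owner dispatch.earliest_time dispatch.latest_time
```

Ad hoc realtime searches started by users will not show up here, only saved ones.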

Second, make sure all saved searches and scheduled searches are using smart mode.

https://answers.splunk.com/answers/542718/splunk-searches-slow.html

Third, make sure that dashboards aren't spamming your instance. They shouldn't be recalculating very often, and if many people are using the same dashboard, then its panels should load the results of a periodic saved search rather than each running the same redundant search themselves.

https://answers.splunk.com/answers/432254/is-it-better-to-use-loadjob-or-scheduled-saved-sea.html
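As an illustration, a dashboard panel can pull the most recent results of a scheduled saved search with loadjob instead of running the search again for every viewer. The owner, app and search name here (admin, search, "Hourly Error Summary") are made up, so substitute your own:

```
| loadjob savedsearch="admin:search:Hourly Error Summary"
```

This only returns something if the saved search is scheduled and has already run, and the panel is only as fresh as the last scheduled run.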

Fourth, check individual searches that take a long time and see if they can be corrected not to waste resources. Anything with map or transaction or more than one join is probably a good candidate for a refactor. Take each kind of search that is really slow to run, and research here on answers if there is a better way. After you've researched, if you can't figure it out, write a single question for one problem search, and see what we can help you with.
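One common refactor, as an example: a transaction that only groups events by a single key can often be replaced with stats, which is usually much cheaper. This sketch assumes a hypothetical index (web), sourcetype (access_combined) and field (session_id), and it assumes you only need the start time, end time, duration and event count per session rather than the merged raw events:

```
index=web sourcetype=access_combined
| stats min(_time) AS start max(_time) AS end count BY session_id
| eval duration = end - start
```

If you depend on transaction options such as maxspan or startswith, this is not a drop-in replacement and needs more work.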

woodcock · ‎07-05-2023

The search above is slightly wrong. Try this:

| rest /services/search/jobs splunk_server=local
| stats count avg(performance.command.addinfo.duration_secs) AS avg max(performance.command.addinfo.duration_secs) AS max
BY search
| sort 0 - max - avg

R15

Neither is working for me. Their search gives an unwieldy table with 100+ columns; yours returns only blanks for avg and max.
Splunk 9.1.2
