How do I fix this Search lag error?

tpchi
New Member

Hi team,
I am seeing the following error in my Splunk health check:
"The number of extremely lagged searches (1) over the last hour exceeded the red threshold (1) on this Splunk instance"
Do you have any idea what I should do?

DalJeanis
Legend

Okay, so what it is telling you is that you have searches on this instance that are lagging badly.

You need to figure out WHAT those searches are, and WHY they are slow.
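If the lagged searches are scheduled ones, the scheduler log is one place to look. The following is only a sketch: it assumes you can read the _internal index and that your scheduler events carry the scheduled_time, dispatch_time, app and savedsearch_name fields. The lag_secs, max_lag and avg_lag names are made up for illustration.

```spl
index=_internal sourcetype=scheduler status=success
| eval lag_secs = dispatch_time - scheduled_time
| stats count max(lag_secs) AS max_lag avg(lag_secs) AS avg_lag BY app savedsearch_name
| sort 0 - max_lag
```

The searches at the top of the result are the ones that started latest relative to their schedule, which makes them reasonable first candidates to investigate.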

You can start by trying to figure out which jobs are taking up lots of time:

|rest /services/search/jobs | sort 0 - performance_command_addinfo_duration_secs
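If the full job listing is too wide to read, a narrower variant may help. This is a sketch that assumes the jobs endpoint on your version returns the sid, label, author, dispatchState and runDuration fields; adjust the field list to whatever your instance actually returns.

```spl
| rest /services/search/jobs splunk_server=local
| table sid label author dispatchState runDuration
| sort 0 - num(runDuration)
```

Note that this only covers jobs that still exist on the search head, so it shows recent activity rather than a full history.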

Then you can start looking at the biggest time wasters, and seeing what might be making them slow. There are dozens of things we could look at, from the very simple to the very complex.

First, get rid of all realtime searches. They are almost never really needed. Use near-real-time searches that run every minute or two instead, or use data models, or any of a number of other strategies that save CPU cycles.

https://answers.splunk.com/answers/734767/why-are-realtime-searches-disliked-in-the-splunk-w.html
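As a rough illustration of a near-real-time replacement, a scheduled search can cover a fixed one-minute window that ended a minute ago, so late-arriving events have time to be indexed. The index, sourcetype and field names here are hypothetical placeholders.

```spl
index=web sourcetype=access_combined earliest=-2m@m latest=-1m@m
| stats count BY status
```

Scheduled to run every minute, this gives a view that is one to two minutes behind but releases its search slot as soon as each run finishes, rather than holding it open the way a realtime search does.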

Second, make sure all saved searches and scheduled searches are using smart mode.

https://answers.splunk.com/answers/542718/splunk-searches-slow.html

Third, make sure that dashboards aren't spamming your instance. They shouldn't be recalculating very often, and if many people are using the same dashboard, then it should be based on loading the results of a periodic saved search, rather than each user running the same redundant search themselves.

https://answers.splunk.com/answers/432254/is-it-better-to-use-loadjob-or-scheduled-saved-sea.html
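A dashboard panel that reuses the latest results of a scheduled saved search might look something like the sketch below. The owner, app and saved search name ("admin", "search", "Hourly Error Summary") are hypothetical and need to be replaced with your own.

```spl
| loadjob savedsearch="admin:search:Hourly Error Summary"
```

Every viewer of the panel then reads the same cached result set, so the underlying search runs once per schedule instead of once per page load.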

Fourth, check individual searches that take a long time and see if they can be corrected not to waste resources. Anything with map or transaction or more than one join is probably a good candidate for a refactor. Take each kind of search that is really slow to run, and research here on answers if there is a better way. After you've researched, if you can't figure it out, write a single question for one problem search, and see what we can help you with.
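As one example of that kind of refactor, a transaction that only groups events by a single field can often be replaced with stats. This sketch assumes a hypothetical session_id field and hypothetical index and sourcetype names, and it only applies when you do not need transaction options such as maxpause or startswith.

```spl
index=web sourcetype=access_combined
| stats min(_time) AS start max(_time) AS end count AS events BY session_id
| eval duration = end - start
```

This yields one row per session with its duration and event count, which is usually what a simple transaction session_id was being used for, at a much lower cost.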

woodcock
Esteemed Legend

The search above is slightly wrong.  Try this:

| rest /services/search/jobs splunk_server=local
| stats count avg(performance.command.addinfo.duration_secs) AS avg max(performance.command.addinfo.duration_secs) AS max
BY search
| sort 0 - max - avg

R15
Path Finder

Neither is working for me. The first search gives an unwieldy table with 100+ columns, and yours has only blanks for avg and max.
Splunk 9.1.2
