Solved: Too many search jobs found in the dispatch directo...

xsstest · ‎07-26-2017

The cause of the matter is here:
https://answers.splunk.com/answers/556169/how-to-bring-together-the-alert-results-together.html

Yes, I have a lot of alerts about attacks, and I want to bring their results together so that the IPs of all attackers are displayed in one place. That is why I chose a summary index.

So here are my questions (all of them are about the summary index):

1. My search head showed the following message. The searches that write to the summary index are triggered in real time, and I have a lot of summary indexes. Is that why this message appears? Does it mean that I cannot create too many summary indexes?

Too many search jobs found in the dispatch directory (found=5064, warning level=5000). This could negatively impact Splunk's performance, consider removing some of the old search jobs.

2. When an alert is triggered and produces a result, that result should be written to the summary index. But why do I get 5 duplicate results in my summary index? Is it because of the SHC?

I have 3 search heads, and the summary indexes on all three search heads are forwarded to the indexer cluster.

DalJeanis · ‎07-26-2017

First, when you have duplicates of alerts, it is usually because you are searching across a time range longer than the frequency you run the alert on. If you run a job every 5 minutes that tests a 30 minute period, then it will alert 5-6 times on the same data. Likewise, if you are doing a realtime search, it could trigger each time a new event matches the pattern, even if it produces the same alert record (for instance if you were alerting on a series of events using the _time of the first record after X records were detected.) Avoid that issue either by setting the throttle options, or by adjusting the time range for the alert and the frequency in tandem with each other.
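To make the overlap arithmetic concrete, here is a minimal Python sketch. It is not Splunk code; it assumes a simplified model in which runs fire at minutes 0, period, 2*period, and so on, and each run searches the range (run_time - window, run_time]. The function name and parameters are made up for illustration.

```python
def runs_matching_event(event_minute, period, window):
    """Return the run times (in minutes) whose search window contains the event.

    Simplified model (an assumption, not Splunk's scheduler): runs fire at
    0, period, 2*period, ... and each run searches (run_time - window, run_time].
    """
    # First scheduled run at or after the event (ceiling division).
    first_run = -(-event_minute // period) * period
    # A run at time t sees the event while t < event_minute + window.
    return list(range(first_run, event_minute + window, period))


# Every 5 minutes over a 30 minute window: the same event is matched repeatedly.
print(len(runs_matching_event(62, period=5, window=30)))
# Window equal to the period: the event is matched by exactly one run.
print(runs_matching_event(62, period=5, window=5))
```

Under these assumptions, an event falls inside six consecutive runs when a 30 minute window is searched every 5 minutes, and inside exactly one run when the window matches the frequency, which is why adjusting the two in tandem removes the duplicates.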

Second, unless there is a REALLY GOOD REASON that you can't wait one minute to receive an alert, you shouldn't set alerts on real-time searches. Consider the actual real-life work environment. The frequency should be driven based on WHO REALLY NEEDS TO KNOW and HOW FAST WILL THEY NEED TO RESPOND.

How fast is someone really going to respond? How fast does someone really need to know? If there is not a person hired to sit at a console to respond to an emergency NOW if the alert goes off, then your business use case is probably for a periodic search, every 5m or 15m or 1h or whatever. If it is information-only for planning purposes, then running the job every HOUR is too frequent.

Typically, default to 5m if someone needs to know what is happening now. Upgrade to 3m or 2m or 1m only if there's a strong need for minute-by-minute visibility and someone is actually working that KPI as their major responsibility. Or, if there's money or lives involved and data delayed by 3m could actually hurt someone.

nick405060 · ‎08-10-2018

THANK YOU. Real-time alerts spammed our dispatch folder and ended up breaking the entire Splunk interface. Cleared /var/run/splunk/dispatch and modded the real-time alerts and boom, fixed.
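For anyone who wants to check how close they are to the warning before things break, a rough Python sketch follows. It assumes each search job is one subdirectory of the dispatch directory and that the warning fires once the count exceeds the level; both are simplifications, the function names are hypothetical, and the path and the 5000 level are simply the ones quoted in this thread.

```python
import os


def count_dispatch_jobs(dispatch_dir):
    """Count subdirectories of dispatch_dir (assumed: one per search job)."""
    with os.scandir(dispatch_dir) as entries:
        return sum(1 for entry in entries if entry.is_dir())


def over_warning_level(dispatch_dir, warning_level=5000):
    """Return True if the job count exceeds warning_level (5000 in the message above)."""
    return count_dispatch_jobs(dispatch_dir) > warning_level


if __name__ == "__main__":
    # Path as mentioned in this thread; adjust it for your own installation.
    path = "/var/run/splunk/dispatch"
    if os.path.isdir(path):
        print(count_dispatch_jobs(path), over_warning_level(path))
```

This only reads the directory listing; it does not remove anything.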

If anyone doesn't know cron schedules, setting to "* * * * *" should fix this problem. It's alerting every minute instead of real-time.

DalJeanis · ‎08-10-2018

@nick405060 - yes, you can default to every minute if that's appropriate to the use case. Ideally, though, you want to spread them out more, so that some are running every 3 minutes on (0,3...57), some on (1,4...58) and some on (2,5,...59). This avoids having a half zillion skipped searches affecting your health stats, just because they are all trying to run at the same moment every 1 minute. Also, where accurate for the particular use case and KPI, you really want to reduce the frequency on the less urgent/critical KPIs to leave bandwidth for the most critical ones.
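As an illustration of the staggering idea, here is a small Python sketch that builds cron strings with explicit minute lists, one offset per group of searches. The search names are hypothetical and the round-robin assignment is just one possible way to spread them out.

```python
def staggered_cron(step, offset):
    """Build a cron string that fires every `step` minutes, starting at `offset`."""
    minutes = range(offset, 60, step)
    return ",".join(str(m) for m in minutes) + " * * * *"


def assign_schedules(search_names, step=3):
    """Spread searches round-robin across the `step` possible minute offsets."""
    return {
        name: staggered_cron(step, index % step)
        for index, name in enumerate(search_names)
    }


# Hypothetical alert names, for illustration only.
for name, cron in assign_schedules(["alert_a", "alert_b", "alert_c"]).items():
    print(name, cron)
```

With a step of 3, the first search gets minutes 0,3,...,57, the second 1,4,...,58 and the third 2,5,...,59, so no more than a third of the searches start in any given minute.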

woodcock · ‎07-26-2017

The problem is almost certainly related to your use of SHC. Why are you in a SHC? How many active users do you have logging into Splunk and what is the maximum expected simultaneous number of users?
