
Too many search jobs found in the dispatch directory (found=5064, warning level=5000). This could negatively impact Splunk's performance; consider removing some of the old search jobs

xsstest
Communicator

The cause of the matter is here:
https://answers.splunk.com/answers/556169/how-to-bring-together-the-alert-results-together.html

Yes, I have a lot of alerts about the attacks. I want to bring the results together so that, in the end, the IPs of all the attackers are displayed. So I chose the summary index.
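Roughly, the setup looks like this; the index name attack_summary and the fields src_ip and signature only stand in for the real ones. Each alert search collects its results into the summary index, and a separate search then reports across it:

    index=security_events attack_detected=true
    | fields _time, src_ip, signature
    | collect index=attack_summary

    index=attack_summary
    | stats count AS alert_count values(signature) AS signatures BY src_ip
    | sort - alert_count

The second search is the one that produces the final list of attacker IPs across all of the alerts.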

So here are my questions (they are all about the summary index):

1. My search head received the following message. My summary-index searches are triggered in real time and I have a lot of them, so is that why this warning appears? Does this mean I cannot create too many summary indexes?

Too many search jobs found in the dispatch directory (found=5064, warning level=5000). This could negatively impact Splunk's performance; consider removing some of the old search jobs
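(Side note: I understand that old search artifacts can be moved out of the dispatch directory with the clean-dispatch CLI command, something like the sketch below, where the destination folder and the age cutoff are only examples; and the 5000 threshold itself seems to be the dispatch_dir_warning_size setting in limits.conf, though that should be checked against the limits.conf documentation for your version.)

    cd $SPLUNK_HOME/bin
    mkdir -p /tmp/old-dispatch-jobs
    # move search artifacts last modified more than 7 days ago into the holding folder
    ./splunk clean-dispatch /tmp/old-dispatch-jobs/ -7d@d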

2. When an alert is triggered and a result is produced, the result should be written to the summary index. But here is the problem: why do I have 5 duplicate results in my summary index? Is it because of an SHC problem?

I have 3 search heads, and the summary index data from all three search heads is forwarded to the indexer cluster.
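One way to check whether the duplicates come from different search heads is to group the summary events by the host that wrote them (attack_summary and src_ip again stand in for the real names):

    index=attack_summary
    | stats count AS copies values(host) AS writing_hosts BY _time, src_ip
    | where copies > 1

If every copy comes from the same host, the duplication is happening inside a single search (for example a real-time alert firing repeatedly); if the copies come from different hosts, each search head is writing its own copy.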

1 Solution

DalJeanis
Legend

First, when you have duplicates of alerts, it is usually because you are searching across a time range longer than the frequency you run the alert on. If you run a job every 5 minutes that tests a 30-minute period, then it will alert 5-6 times on the same data. Likewise, if you are doing a real-time search, it could trigger each time a new event matches the pattern, even if it produces the same alert record (for instance if you were alerting on a series of events using the _time of the first record after X records were detected). Avoid that issue either by setting the throttle options, or by adjusting the time range for the alert and the frequency in tandem with each other.
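For illustration, both fixes map onto standard savedsearches.conf settings; the stanza name and the src_ip field below are invented for the example:

    [attacker_ip_alert]
    # run every 5 minutes and search exactly the previous 5 minutes, so each event is evaluated once
    cron_schedule = */5 * * * *
    dispatch.earliest_time = -5m@m
    dispatch.latest_time = @m
    # and/or throttle: suppress further alerts for the same source IP for 30 minutes
    alert.suppress = 1
    alert.suppress.fields = src_ip
    alert.suppress.period = 30m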

Second, unless there is a REALLY GOOD REASON that you can't wait one minute to receive an alert, you shouldn't set alerts on real-time searches. Consider the actual real-life work environment. The frequency should be driven by WHO REALLY NEEDS TO KNOW and HOW FAST THEY WILL NEED TO RESPOND.

How fast is someone really going to respond? How fast does someone really need to know? If there is not a person hired to sit at a console to respond to an emergency NOW if the alert goes off, then your business use case is probably for a periodic search, every 5m or 15m or 1h or whatever. If it is information-only for planning purposes, then running the job every HOUR is too frequent.

Typically, default to 5m if someone needs to know what is happening now. Upgrade to 3m or 2m or 1m only if there's a strong need for minute-by-minute visibility and someone is actually working that KPI as their major responsibility. Or, if there's money or lives involved and data delayed by 3m could actually hurt someone.
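For reference, those frequencies correspond to cron_schedule values along these lines:

    # every 5 minutes
    cron_schedule = */5 * * * *
    # every 15 minutes
    cron_schedule = */15 * * * *
    # at the top of every hour
    cron_schedule = 0 * * * *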


nick405060
Motivator

THANK YOU. Real-time alerts spammed our dispatch folder and ended up breaking the entire Splunk interface. Cleared /var/run/splunk/dispatch and modded the real-time alerts and boom, fixed.

If anyone doesn't know cron schedules, setting the schedule to "* * * * *" should fix this problem. It makes the alert run every minute instead of in real time.

DalJeanis
Legend

@nick405060 - yes, you can default to every minute if that's appropriate to the use case. Ideally, though, you want to spread them out more, so that some run every 3 minutes on (0, 3, ..., 57), some on (1, 4, ..., 58), and some on (2, 5, ..., 59). This avoids having a half zillion skipped searches affecting your health stats just because they are all trying to run at the same moment every minute. Also, where accurate for the particular use case and KPI, you really want to reduce the frequency of the less urgent/critical KPIs to leave bandwidth for the most critical ones.
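In savedsearches.conf terms, that staggering looks roughly like this (the stanza names are invented for the example):

    # three groups of alerts, each running every 3 minutes, offset by one minute so they never start together
    [alert_group_a]
    cron_schedule = 0-57/3 * * * *
    [alert_group_b]
    cron_schedule = 1-58/3 * * * *
    [alert_group_c]
    cron_schedule = 2-59/3 * * * *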


woodcock
Esteemed Legend

The problem is almost certainly related to your use of SHC. Why are you in an SHC? How many active users do you have logging into Splunk, and what is the maximum expected number of simultaneous users?
