Real-time alerts that were configured in Splunk suddenly stopped working. When I checked the scheduler.log file, it had log messages such as "reason=realtime rtsearches limit exceeded" or "reason=real time searches pending".
@sathyasubburaj... Whether to use real-time searches/alerts should be decided based on your Splunk infrastructure. Avoid them unless absolutely necessary.
If your system can support them, the relevant settings are located in
Splunk Settings > Access Control > Roles > (Specific Role like Admin)
User-level concurrent real-time search job limits and
Role-level concurrent real-time search job limit settings
You might also need to adjust other settings, such as
Restrict time range,
Restrict Search terms and
Limit total job disk quota accordingly.
Hi Niketnilay,
Thanks for the response.
I created the alerts using the admin user/role.
Below are the settings in Splunk for the admin role.
User-level concurrent real-time search job limits: 100
Role-level concurrent search jobs limit: 200
Restrict time range: 0
Restrict search terms: *
Limit total job disk quota: 10000
Do I need to change the limits?
Below is the query I have configured as a real-time alert. It triggers when the number of results is greater than 1, and triggers once per hour.
index=windows sourcetype="WMI:Service" host= Name=HM* OR Name=SD* OR Name=H&M* OR Name=Board* OR Name=Salsa* status="Stopped" OR status="Stop"|dedup Name,host | rex "Description=(?P<Description>.+).*?" |table Name ,Description,status,time,host |eval Name=upper(Name) |eval Env=case(host = "hostname", "DIT" ) |eval system=case(host = "hostname", "SDS") | convert timeformat="%H:%M:%S %Y-%m-%d" ctime(time) |Rename Name as "SERVICE NAME" status as Status _time as Time host as "SERVER" Env as "Environment" system as "SYSTEM"
I have configured 37 alerts similar to the one above. Does this cause an issue?
37 real-time alerts might overload your system, depending on your hardware specs.
Try this search to see whether the real-time alerts are being skipped:
index=_internal sourcetype=scheduler status=skipped | table _time app user savedsearch_name reason
So the reason the alerts are not firing is that the searches behind them are not running (they are skipped). Most likely you have too many real-time searches running at the same time and not enough cores to support them.
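To put rough numbers on the core limit, here is a minimal Python sketch. It assumes the commonly documented limits.conf defaults (max_searches_per_cpu = 1, base_max_searches = 6, max_rt_search_multiplier = 1, and a scheduler max_searches_perc of 50); those values, the 8-core machine, and the function names are assumptions for illustration, so check limits.conf for your own version.

```python
# Rough estimate of Splunk search concurrency limits.
# All default values below are assumptions; verify them in your limits.conf.

def max_historical_searches(cpu_cores, max_searches_per_cpu=1, base_max_searches=6):
    """System-wide cap on concurrent historical searches."""
    return max_searches_per_cpu * cpu_cores + base_max_searches

def max_realtime_searches(cpu_cores, max_rt_search_multiplier=1):
    """System-wide cap on concurrent real-time searches."""
    return int(max_rt_search_multiplier * max_historical_searches(cpu_cores))

def scheduler_share(limit, max_searches_perc=50):
    """Portion of a limit that the scheduler may use for saved searches/alerts.
    Assumed here to apply to scheduled real-time searches as well."""
    return limit * max_searches_perc // 100

cores = 8                       # hypothetical hardware
configured_rt_alerts = 37       # number of real-time alerts from this thread

rt_limit = max_realtime_searches(cores)
rt_scheduled_limit = scheduler_share(rt_limit)

print(f"real-time search limit: {rt_limit}")
print(f"scheduler share for real-time alerts: {rt_scheduled_limit}")
print(f"alerts over the limit: {max(0, configured_rt_alerts - rt_scheduled_limit)}")
```

Under these assumptions an 8-core search head could run only a handful of scheduled real-time searches at once, which would be consistent with the "rtsearches limit exceeded" messages when 37 are configured.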
It is better to run a scheduled search for alerts at an interval and to minimize the use of real-time searches.
So for your alerts, consider configuring the searches to run, say, every 5 or 15 minutes instead of in real time.
This doc article can help:
Sure, I will read the document. One more question: if I reconfigure the 37 alerts as scheduled searches, will that overload the system?
The doc above elaborates on best practices. I suggest prioritizing your alerts and factoring that priority in when you set up the schedules, so that the highest-priority alerts are served first.
Another important thing to pay attention to is how long each alert search takes to complete. You don't want to schedule a search to run every minute if it takes 3 minutes to complete, since it will never catch up and will tie up a core.
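The runtime-versus-interval point can be checked with simple arithmetic. Below is a minimal Python sketch; the function names, the 50% headroom threshold, and the 20-second runtime are hypothetical values chosen for illustration, not measurements from this system. Actual runtimes can be read from the job inspector or the scheduler logs.

```python
# Sanity check for a search schedule: compare runtime against the interval.
# Names and thresholds here are illustrative assumptions.

def sustained_cores(runtime_s, interval_s):
    """Average number of cores one scheduled search keeps busy."""
    return runtime_s / interval_s

def is_schedule_safe(runtime_s, interval_s, headroom=0.5):
    """True if the search finishes well within its interval."""
    return runtime_s <= interval_s * headroom

# The case from the post above: a 3-minute search scheduled every minute.
print(sustained_cores(180, 60))      # more than one core busy all the time
print(is_schedule_safe(180, 60))     # the search can never keep up

# Hypothetical: 37 alerts, each taking 20 seconds, scheduled every 5 minutes.
alerts, runtime_s, interval_s = 37, 20, 300
total_load = alerts * sustained_cores(runtime_s, interval_s)
print(is_schedule_safe(runtime_s, interval_s))
print(f"average cores busy across all alerts: {total_load:.2f}")
```

With numbers like these, the 37 alerts would keep only a few cores busy on average, and staggering their start times would spread that load further.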