I'm receiving the well-known warning message about the dispatch directory:
Too many search jobs found in the dispatch directory (found=4321, warning level=4000). This could negatively impact Splunk's performance, consider removing some of the old search jobs.
I've looked into the command-line option to clean up manually. It didn't free much space, and neither did running an hourly clean-dispatch script.
I have a search head and two indexers (both of which show the alert for >2000), and I can't figure out why I get warnings on all three. Both ad hoc searches and scheduled searches seem to take up space on both the search head and the indexers. I read http://blogs.splunk.com/2012/09/12/how-long-does-my-search-live-default-search-ttl/ and wondered what the 10-minute default in limits.conf is good for on all three Splunk machines. Is there a distinction between the ttl you can or should use on the search head and on the indexer(s)?
How long search artifacts should be stored on disk once completed, in seconds. The ttl is computed relative to the modtime of status.csv of the job if such file exists or the modtime of the searchjob's artifact directory. If a job is being actively viewed in the Splunk UI then the modtime of status.csv is constantly updated such that the reaper does not remove the job from underneath. Defaults to 600, which is equivalent to 10 minutes.
Given the part about better performance when removing 'some' old search jobs, I wonder what pros and cons to expect from using:
[search]
ttl=300
remote_ttl=300
What good does keeping a job in dispatch for 5 or 10 minutes after completion do, or even for more than a minute? Do similar searches run afterwards use it as some kind of caching mechanism (and if so, how, with two indexers at work)? And in the case of a search head with two indexers, should the ttls be identical on all of them, or doesn't it matter?
I believe it's a tad late to respond to your question now, but I will give my comments in case you are still interested.
To answer your question, "What good does keeping it in dispatch 5 or 10 minutes after completion, or even more than a minute - do similar searches after that use this as some caching mechanism": I don't think similar searches use the artifacts as a caching mechanism. What I use them for is to get a brief idea of the types of searches being run, so that we can ask the respective people to follow search best practices and avoid putting a lot of load on the system.
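Now to respond to your query on how to maintain or control it. There are multiple ways:
1. Move old artifacts out of the dispatch directory with the clean-dispatch command you mentioned. It takes two arguments, for example:
/tmp = the directory where you want the dispatch artifacts to be moved to.
-24h@h = the cutoff age; dispatch artifacts older than this are moved out of dispatch.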
2. You can change the retention periods of your saved searches. They are controlled by the ttl or timeout parameter, though depending on how the search is scheduled, there are many places where the value may be set or overridden. See the savedsearches.conf and alert_actions.conf files.
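As a rough sketch of where those settings live, the fragments below set a one-hour retention for a scheduled search and for the email alert action. The stanza name "My scheduled search" and the value 3600 are placeholders made up for illustration, not recommendations. Both settings also accept a multiple of the scheduling period (for example 2p).

```ini
# savedsearches.conf (in a local/ directory)
# "My scheduled search" is a hypothetical saved search name.
[My scheduled search]
# Keep this search's artifacts for 3600 seconds (illustrative value).
dispatch.ttl = 3600

# alert_actions.conf (in a local/ directory)
[email]
# Artifact lifetime applied when the email action fires (illustrative value).
ttl = 3600
```

Keep in mind that when an alert action fires, the action's ttl takes over from dispatch.ttl, so it is worth checking both files when a job lives longer than expected.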
As for users, you can use roles to limit the amount of disk space a user takes up, which should indirectly limit the number of jobs they keep around.
Set srchDiskQuota in authorize.conf. This specifies the maximum amount of disk space (MB) that can be taken by the search jobs of a user who belongs to this role.
NOTE: Do not edit or delete any roles in $SPLUNK_HOME/etc/system/default/authorize.conf. This could break your admin capabilities. Make your changes in $SPLUNK_HOME/etc/system/local/ instead.
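For illustration, a role-level quota might look like the fragment below. The role name analyst and the 200 MB figure are assumptions made up for this example, so pick values that fit your own roles.

```ini
# $SPLUNK_HOME/etc/system/local/authorize.conf
# "role_analyst" is a hypothetical role; 200 is an illustrative value in MB.
[role_analyst]
srchDiskQuota = 200
```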
I hope this helps. Please also share your approach if you have found a better one.