Is there a way for the REST API to pause a search job so that it only retrieves X number of results first?

We have a web application that is making REST API calls to Splunk to run searches to retrieve results. We expect a lot of users to run very broad searches that will return thousands if not millions of results. As human beings, the users will probably only look at the first few pages (like how most of us use Google). These big searches not only take time to fully finish, but the completed jobs can take up many GBs in the dispatch directory.

Is there a way for the REST API to pause a search job so that it only retrieves X number of results first? Is there a way to then resume the search if the user wants more? The goal is to only fill the dispatch directory with as much data as the user is able to page through on a web app UI so that we don't get the "out of disk space" error which causes newly submitted searches to get queued.

SplunkTrust

@emiliavanderwerf, first off: as the Splunk administrator, restrict the role that the external app uses. Limit its maximum disk quota, maximum concurrent searches, and selectable time range.

Secondly, you can configure the app to pull only a limited number of records by default, using | head 10 or | head 100 in your default saved search. Run the full saved search with more or all rows only when the user requests details. This approach implies duplicating each saved search as a Summary version and a Details version. Give access to the Details saved searches only to specific users.
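As a rough illustration of the Summary/Details split, the web app could derive the capped search string from the full one. This is only a sketch in Python: the helper name `with_head` and the sample search string are made up for the example, not part of any Splunk library.

```python
def with_head(spl: str, limit: int) -> str:
    """Return the search string capped with '| head <limit>' (hypothetical helper)."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"{spl.rstrip()} | head {limit}"


# Hypothetical searches: the Details version is the full search,
# the Summary version is the same search capped at 100 rows.
details_search = "search index=main error"
summary_search = with_head(details_search, 100)
print(summary_search)
```

The app would submit `summary_search` by default and fall back to `details_search` only for users who are allowed to see everything.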

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

SplunkTrust

Check out the count option at http://docs.splunk.com/Documentation/Splunk/7.1.1/RESTREF/RESTprolog#Pagination_and_filtering_parame...

---
If this reply helps you, an upvote would be appreciated.

| rest /services/search/jobs/%s/results output_mode=raw count=600000 offset=6

Combining count and offset lets you page through the results instead of fetching them all at once.
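From a web app, the same count/offset idea might look like the sketch below, which only builds the URL for one page of results and does not send anything. The host `splunk.example.com:8089`, the search ID, and the helper name `results_page_url` are assumptions for illustration.

```python
from urllib.parse import quote, urlencode


def results_page_url(base_url: str, sid: str, page: int, page_size: int = 50) -> str:
    """Build the results URL for a zero-based page using count and offset."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    params = {
        "output_mode": "json",
        "count": page_size,            # rows per page
        "offset": page * page_size,    # rows to skip before this page
    }
    return f"{base_url}/services/search/jobs/{quote(sid, safe='')}/results?{urlencode(params)}"


# Hypothetical host and search ID: third page (page index 2) of 50 rows.
print(results_page_url("https://splunk.example.com:8089", "1234.5", 2))
```

Note that this only limits what each request returns; it does not by itself limit how much the job writes to the dispatch directory.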

Thanks for your help. My understanding of count is that even though only count entries are returned, the search still runs to completion in the background and (if it is a very broad search) will still take up many GBs in the dispatch directory. Using count does not pause the search after count results have been retrieved, and it does not resume the search if the user requests more results with a larger count.

The main problem I'm trying to avoid is that broad searches take up a large amount of space in the dispatch directory. Do you have more suggestions please?

SplunkTrust

Tough problem. Consider the auto_cancel and auto_finalize_ec parameters when you submit the search/jobs request.
You can also cancel the job by posting action=cancel to search/jobs/{search_id}/control if the user doesn't request more results, or maybe even between result batches.
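A minimal sketch of how a client might prepare those two requests is shown below. It only builds the form fields and URL and sends nothing. The host, search string, limits, and helper names are assumptions for illustration, and the list of control actions reflects my reading of the REST reference, so check it against your Splunk version.

```python
from urllib.parse import quote, urlencode

# Control actions as I understand the REST reference; verify for your version.
CONTROL_ACTIONS = {"pause", "unpause", "finalize", "cancel"}


def job_create_form(search: str, max_events: int, idle_seconds: int) -> dict:
    """Form fields for POST search/jobs with the two suggested limits."""
    return {
        "search": search,
        "auto_finalize_ec": max_events,  # finalize after about this many events
        "auto_cancel": idle_seconds,     # cancel after this many idle seconds
        "output_mode": "json",
    }


def control_request(base_url: str, sid: str, action: str = "cancel"):
    """Return (url, body) for POST search/jobs/{search_id}/control."""
    if action not in CONTROL_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    url = f"{base_url}/services/search/jobs/{quote(sid, safe='')}/control"
    return url, urlencode({"action": action})


# Hypothetical values: stop after roughly 1000 events, cancel after 5 idle minutes.
print(job_create_form("search index=main error", 1000, 300))
print(control_request("https://splunk.example.com:8089", "1234.5"))
```

The same control helper could be used with `pause` and `unpause` if the app wants to hold a job while the user decides whether to page further.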