I ran a search recently that took a couple of hours. The number of results was pretty low, only a few thousand, but the time frame for the search was very long. Since I did not want to wait for the search to complete, I sent the job to the background. After I got the notification that the job was complete, I wanted to export the results to a CSV. But when I went to do this, the entire search seemed to get executed again, which resulted in a browser timeout, and I was unable to get the export.
I don't recall having this problem before (I recently upgraded from 6.3.3 to 6.5.1), so I opened a support case on this. Support is telling me to pipe the search to the 'outputcsv' command. If I have to pipe the output to 'outputcsv' to export results, that's kind of a pain, because I won't always know ahead of time when it will be required; I won't always know how long a search will take. And in order to get an estimate and then get the results, I'll have to run the search twice anyway.
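For reference, what support suggested looks something like this (the index, time range, and file name here are only placeholders, not my actual search):

```
index=main sourcetype=access_combined earliest=-2y
| stats count by clientip
| outputcsv my_long_search.csv
```

The CSV lands in $SPLUNK_HOME/var/run/splunk/csv on the search head, which is another reason this isn't a great substitute for the Export button.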
This doesn't feel right to me. Is it normal Splunk behavior that I can't export results when the query takes a long time to run?
Okay, one thing here: if you are trying to add an outputcsv to the end, it will always result in a rerun, because you have changed the search.
On the other hand, if the job is completed, you can go to the jobs list, inspect the job, and find the job SID; then you can use loadjob to start a new search where the old search left off. Also, if your email was set to send a link to the search results, the SID is at the end of the link, so you can copy the SID and just use it with loadjob.
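In other words, something like this (the SID below is just an example; substitute the one from your jobs list or email link):

```
| loadjob 1481230000.12345
| outputcsv my_long_search.csv
```

Because loadjob reads the retained artifact instead of re-searching the indexes, this should return almost immediately.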
So I talked with @cpride_splunk about this, and he let me know that if the UI determines that the results were truncated, it will rerun the search when exporting.
If you still want the export without a rerun, it can be done through the REST API.
For slightly more detail: what the export button in the UI does (as of 6.5) is check whether the events are truncated, which can happen for a couple of potential reasons (e.g., your search is an event search that used remote timeliner, meaning not all events were sent to the search head).
Then, if the results were truncated, it re-runs the search using the original parameters for the job, but against the endpoint "services/search/jobs/export". If the results were not truncated, it uses "services/search/jobs/<sid>/(events|results)/export" on the REST API.
If you have a job that was truncated but want all currently retained data without re-running the search, you can use "services/search/jobs/<sid>/(events|results)/export" to get that data. Or, if you knew you wanted an export from the beginning, you can skip the UI run and use "services/search/jobs/export" from the get-go, if you are comfortable using the REST API.
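Roughly, the two calls look like this (the SID and the search string are placeholders; output_mode can be csv, json, or xml):

```
# Export whatever the completed job retained, without a rerun:
GET /services/search/jobs/<sid>/results/export?output_mode=csv

# Re-run the search from scratch and stream everything out as it completes:
POST /services/search/jobs/export
     search="search index=main sourcetype=access_combined | stats count by clientip"
     output_mode=csv
```

The export endpoints stream results rather than holding them in memory, which is why they don't hit the browser-timeout problem the UI does.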
There is also support in some of the SDK packages on dev.splunk.com for interacting with the REST API. ( http://docs.splunk.com/DocumentationStatic/PythonSDK/1.6.2/client.html#splunklib.client.Jobs has the reference for interacting with the jobs endpoint from the Python SDK for Splunk.)
This doesn't strike me as an acceptable solution. A customer has to write automation using the API in order to get (what appears to be) a simple result set? The events were sitting there right in front of me - I could copy/paste from the web page without having to write any automation. And I don't feel it is a good assumption that customers have the ability to write such API automation - not all customers will be that sophisticated.
I still have an open case on this and am awaiting a response.
There are instances where we've never been able to export results. We've even broken the search query up by month and tried other (inefficient) slice-and-dice approaches. Still nothing. It's either work with what's on the screen before the shift ends, or nada...
I haven't noticed this happening to me in 6.5.1, so now I'll have to keep my eyes open.
This is what the doc has to say about it:
"If your search returns a large number of results, you can access all of the results in the Search app. However, the full set of results might not be stored with the search job artifact. When you export search results, the export process is based on the search job artifact, not the results in the Search app. If the artifact does not contain the full set of results, a message appears at the bottom of the Export Results dialog box to tell you that the search will be rerun by the Splunk software before the results are exported."
Yes, I found that paragraph too. But in my case, the number of results is not high, which is what the documentation specifically mentions. When I export, there is a link stating that the search will be executed again. But I don't understand why, or whether I can somehow force the results to be stored with the artifact to avoid having to run the search twice. The documentation says nothing about this.