I've been experimenting with a number of different settings, but here are my current search args:
JobExportArgs searchArgs = new JobExportArgs();
searchArgs.setIndexEarliest(startDate.toString("YYYY-MM-DDThh:mm:ss.mss"));
searchArgs.setIndexLatest(endDate.toString("YYYY-MM-DDThh:mm:ss.mss"));
searchArgs.setSearchMode(JobExportArgs.SearchMode.NORMAL);
searchArgs.setOutputMode(JobExportArgs.OutputMode.XML);
searchArgs.setAutoCancel(0);
searchArgs.setAutoFinalizeEventCount(0);
searchArgs.setAutoPause(0);
I then invoke the search and parse the result more or less identically to the code sample in "To run an export search"...
MultiResultsReaderXml multiResultsReader = new MultiResultsReaderXml(service.export(search, searchArgs));
for (SearchResults searchResults : multiResultsReader) {
    if (searchResults.isPreview())
        log.info("Search in progress");
    else
        log.info("Search finalized");
    for (Event event : searchResults) {
        for (String k : event.keySet()) {
            String s = event.get(k);
            // add string to collection
        }
    }
}
This returns on the order of 100k-200k results, many of which are duplicates. If I paste the exact same search string into the Web UI, and set the custom time range to the exact same earliest/latest times, I get back ~4M unique results.
This is a very large search, which is why I opted for the export search (based on documentation); was that the wrong move? Am I doing something wrong in the parsing of my results? Am I missing some vital search arg? You can see I'm setting everything to 0 right now, mostly because I have no idea what might be cutting the search off short.
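One thing worth ruling out on the parsing side: the loop above logs whether a result set is a preview, but it adds events to the collection either way. If the export stream delivers preview sets that are partial snapshots of the final one (an assumption, not something confirmed in this thread), every preview row would be counted again later. The sketch below shows the idea without the Splunk SDK. `ResultSet` and `collectFinal` are hypothetical stand-ins for `SearchResults` and the collecting loop, not part of any library.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class FinalResultsCollector {

    /** Hypothetical stand-in for one result set read from the export stream. */
    static final class ResultSet {
        final boolean preview;
        final List<String> values;

        ResultSet(boolean preview, List<String> values) {
            this.preview = preview;
            this.values = values;
        }
    }

    /** Collects values from non-preview sets only and drops repeated values. */
    static Set<String> collectFinal(List<ResultSet> sets) {
        Set<String> unique = new LinkedHashSet<>();
        for (ResultSet set : sets) {
            if (set.preview) {
                continue; // assumed to be a partial snapshot of later results
            }
            unique.addAll(set.values);
        }
        return unique;
    }

    public static void main(String[] args) {
        List<ResultSet> sets = Arrays.asList(
                new ResultSet(true, Arrays.asList("a", "b")),
                new ResultSet(true, Arrays.asList("a", "b", "c")),
                new ResultSet(false, Arrays.asList("a", "b", "c", "d")));
        System.out.println(collectFinal(sets));
    }
}
```

In the real loop the equivalent change would be to skip the inner `for (Event event : searchResults)` whenever `searchResults.isPreview()` returns true. Collecting into a `Set` also hides any remaining duplicates, so it is worth counting raw rows separately while debugging.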
May have found at least a partial answer/solution... my query string ended with:
... | table entityKey
What seems to have made a difference is using "fields" instead:
... | fields entityKey | fields - _*
And then of course it turns out this line (which I added after my flailing had begun) prevents the search from ever finishing:
searchArgs.setAutoFinalizeEventCount(0);
So: fix the search string, remove that arg, and voilà, ~4M unique results.
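If the query strings are built in code, the same swap can be applied mechanically. This is only a rough sketch: `useFieldsInsteadOfTable` is a hypothetical helper, it only handles a `| table ...` clause at the very end of the query, and it assumes the field list contains no pipe characters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExportQueryRewrite {

    // Matches a trailing "| table <field list>" at the very end of the query.
    private static final Pattern TRAILING_TABLE =
            Pattern.compile("\\|\\s*table\\s+([^|]+?)\\s*$");

    /** Replaces a trailing "| table X" with "| fields X | fields - _*". */
    static String useFieldsInsteadOfTable(String query) {
        Matcher m = TRAILING_TABLE.matcher(query);
        if (!m.find()) {
            return query; // no trailing table clause, leave the query alone
        }
        String fieldList = m.group(1);
        return query.substring(0, m.start())
                + "| fields " + fieldList + " | fields - _*";
    }

    public static void main(String[] args) {
        System.out.println(
                useFieldsInsteadOfTable("search index=main | table entityKey"));
    }
}
```

The search string in `main` is made up for illustration. The `fields - _*` part is the same clause used in the fixed query above, which drops the underscore-prefixed internal fields from each event.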