Solved: programmatically setting search mode to fast

johncokerc3 · ‎03-25-2014

I'm writing a Splunk app, and want to make the default search mode fast (instead of smart), since I'm building a complex query that extracts the parameters of interest myself. I couldn't find how to set the search mode to 'fast' through the template or through JavaScript. Ideally I would be able to set it per search manager.

Here's an example query I'm generating for Splunk 6.0:

a_t AND host="jcoker-mac.local" AND rootReq
| stats values(a_func), values(a_pfuncs), count, sum(a_t), sum(a_self_cpu), sum(a_self_io), avg(a_t), avg(a_self), avg(a_cpu), avg(a_self_cpu), avg(a_io), avg(a_self_io), avg(a_sql), avg(a_self_sql), avg(a_kv), avg(a_self_kv) by t_type, t_action, a_st
| sort -sum(a_t), -sum(a_self), -count
| rename t_type AS "Type", t_action AS "Action", a_st AS "Status", values(a_func) AS "Function", values(a_pfuncs) AS "Parent Functions", count AS "Calls", sum(a_t) AS "Σ T. time", sum(a_self_cpu) AS "Σ S. CPU", sum(a_self_io) AS "Σ S. I/O", avg(a_t) AS "μ T. time", avg(a_self) AS "μ S. time", avg(a_cpu) AS "μ T. CPU", avg(a_self_cpu) AS "μ S. CPU", avg(a_io) AS "μ T. I/O", avg(a_self_io) AS "μ S. I/O", avg(a_sql) AS "μ T. SQL", avg(a_self_sql) AS "μ S. SQL", avg(a_kv) AS "μ T. K/V", avg(a_self_kv) AS "μ S. K/V"

So, I'm explicitly fetching individual fields, and I don't care about any other fields (nor do I care about extra metadata since that isn't going anywhere).

sideview · ‎03-25-2014

When you change the "search mode" in the Splunk UI, ultimately the effect is to send an argument called "adhoc_search_level" on the POST request that dispatches the search. The value is one of fast/smart/verbose.

"smart" does a little more client-side logic, but the main upshot is that when this argument is present on the dispatch POST, it overrides older arguments called "status_buckets" and "required_field_list", even though frequently those older arguments are also submitted. However the interaction between the three arguments may be more complicated.

Note that if your search possesses a fields command, it's possible that the fields command contributes some effect as well, but I doubt that the fields command is sufficient to completely counteract everything that might be happening if you're sending the other API args.

Anyway, let's back up a little. Every Splunk search has a "streaming portion", which is basically the initial search clause plus whatever "streaming" commands (e.g. eval, rename, rex, where) come after it, up to but not including the first "transforming command" (e.g. stats, timechart, chart, transaction, sort). If there is no transforming command, the search is a "purely streaming" search. (Conversely, if the search is something like "| inputlookup", there is no streaming portion and all these args are ignored.)
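To make the "streaming portion" idea concrete, here is a rough Python sketch that splits a search string at the first transforming command. The function name `split_streaming` and the `TRANSFORMING` set are made up for illustration; the set only holds the commands named above, and the split is naive (it breaks on pipes inside quotes or subsearches), so treat it as a picture of the idea rather than how splunkd parses anything.

```python
# Commands this post treats as "transforming". Illustrative only, not a complete list.
TRANSFORMING = {"stats", "timechart", "chart", "transaction", "sort"}


def split_streaming(search):
    """Return (streaming_portion, remainder) for a search string.

    Naive: splits on every '|', so pipes inside quotes or subsearches
    are handled incorrectly.
    """
    text = search.strip()
    # A search that starts with a pipe (e.g. "| inputlookup ...") is treated
    # here as having no streaming portion at all.
    if text.startswith("|"):
        return "", text
    segments = [s.strip() for s in text.split("|")]
    # Segment 0 is the initial search clause; look for the first later
    # segment whose command name is a transforming command.
    for i, segment in enumerate(segments[1:], start=1):
        command = segment.split(None, 1)[0] if segment else ""
        if command in TRANSFORMING:
            streaming = " | ".join(segments[:i])
            remainder = "| " + " | ".join(segments[i:])
            return streaming, remainder
    # No transforming command found: a "purely streaming" search.
    return " | ".join(segments), ""


streaming, remainder = split_streaming(
    "a_t AND rootReq | eval x=1 | stats count by x | sort -count"
)
print(streaming)
print(remainder)
```

With that input, everything up to and including the eval lands in the streaming portion, and the stats and sort end up in the remainder.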

For searches that have a streaming component, status_buckets is an integer that tells Splunk how many time buckets it should keep for summary statistics about the extracted fields present in the streaming portion of the results. This last part is weird - status_buckets only concerns the space of fields that exist at the last pipe before the transforming stuff starts.

required_field_list, on the other hand, tells Splunk which fields, in addition to any that are already specified in the search language (e.g. in a fields command), it should keep summary data for in its status_buckets.

Example: you have a search over exactly 2 hours of data, and you set status_buckets to 2, and required_field_list to "username,application". Splunkd notices that application is already mentioned explicitly in the search language so the application in required_field_list is ignored. It then proceeds to keep 2 buckets of summary statistics, for all the fields referenced in the search, plus the field "username".
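The example above boils down to a set difference, which can be sketched in a few lines of Python. The function `extra_summary_fields` is hypothetical, and it takes the set of fields mentioned in the search as an input instead of working it out from the search string. It only models the behavior as described here, not splunkd's actual logic.

```python
def extra_summary_fields(required_field_list, fields_in_search):
    """Return the fields from required_field_list that add something new.

    required_field_list: comma-separated string, as sent on the dispatch POST.
    fields_in_search: set of field names already mentioned in the search.
    """
    requested = [f.strip() for f in required_field_list.split(",") if f.strip()]
    # Fields the search already references explicitly contribute nothing extra.
    return [f for f in requested if f not in fields_in_search]


# "application" is already in the search, so only "username" is added.
print(extra_summary_fields("username,application", {"application"}))
```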

Short version - If you're using the Advanced XML or Sideview Utils in 6.0, or if you're using Splunk 5.X or earlier, the modules are responsible for automatically sorting out what the status_buckets and required_field_list should be. The Splunk modules do a decent job of this. The Sideview modules from Sideview Utils do perhaps a slightly better job. Either way you don't have to really worry about any of this stuff and the searches dispatched will be as optimized as they can be. If you're ever worried that they're not optimized, just use a tool like Firebug to look at the POST args and you can see for yourself, or you can see status_buckets in the job dictionary if you inspect the job.

In the new Splunk Web Framework that ships with 6.0, you may want to worry about this. But I would advise you to just make sure that status_buckets is unset or being set to 0, and required_field_list is left unset, and you'll be fine. You can set adhoc_search_level=fast as well, but on a search where status_buckets is omitted or 0, and required_field_list is omitted, I really don't think it's going to make any difference.
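If you want to see what that advice looks like as dispatch arguments, here is a minimal Python sketch that only builds the form-encoded POST body and sends nothing. The helper names `dispatch_args` and `dispatch_body` are made up for illustration; the argument names are the ones discussed in this thread. How you attach these arguments to a search manager in the Web Framework is not shown, and I am assuming you would post the body to the search job dispatch endpoint yourself.

```python
from urllib.parse import urlencode

VALID_LEVELS = ("fast", "smart", "verbose")


def dispatch_args(search, search_level="fast"):
    """Build dispatch arguments following the advice in this thread."""
    if search_level not in VALID_LEVELS:
        raise ValueError("search_level must be one of %s" % (VALID_LEVELS,))
    # status_buckets and required_field_list are deliberately left out,
    # so no field summary data is requested.
    return {"search": search, "adhoc_search_level": search_level}


def dispatch_body(search, search_level="fast"):
    """Return the arguments as a form-encoded POST body."""
    return urlencode(dispatch_args(search, search_level))


print(dispatch_body("search a_t AND rootReq | stats count by t_type"))
```

Checking the printed body against what Firebug shows for your own dispatch POST is a quick way to confirm that status_buckets and required_field_list really are absent.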
