The partitions argument runs the reduce step (in parallel reduce processing) with multiple threads in the same search process on the same machine. Compare that with parallel reduce, which runs the reduce step in parallel on multiple machines.
Parallel reduce is implemented with the redistribute command: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Redistribute
In testing, the run time for a search using partitions=5 shows no difference compared to partitions=1, using this search:
| makeresults count=5000000 | streamstats count as i | eval a=floor(i/10), b=floor(i/100), c=floor(i/1000), d=floor(i/10000), e=floor(i/100000) | stats partitions=5 count by a, b, c, d, e
In this example, we are able to observe an ~9 second difference in run times:
| makeresults count=500000 | streamstats count as i | eval j=mvrange(0,10) | stats partitions=15 count by i,j
With partitions=15 the search completes in 27 seconds. With partitions=1 the search completes in 36 seconds.
The likely issue with the partitions argument, as compared to the redistribute command, is that the threads compete for memory on the same machine, and memory usage is one of the significant factors that cause high-cardinality stats searches to perform poorly. Partitioning the memory usage across different machines, as the redistribute command does, eliminates that competition.
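For comparison, here is a rough sketch of what the redistribute form of a high-cardinality stats search could look like. It assumes a distributed deployment with parallel reduce already configured on the indexers and events coming from an index rather than makeresults. The index and field names are placeholders, not something taken from the tests above, and this variant was not timed here.

```
index=<your_index>
| redistribute by <high_cardinality_field>
| stats count by <high_cardinality_field>
```

The by clause on redistribute is optional. It is shown here only to make explicit which field the reduce work would be split on.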
Well, the docs say that 1 is the default value for partitions, so your search can drop it with no change. The documentation is useless on this parameter. You ask a good question.