Optimizing sorting algorithms by using sorting networks (Q2628305)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Optimizing sorting algorithms by using sorting networks |
scientific article |
Statements
Optimizing sorting algorithms by using sorting networks (English)
0 references
1 June 2017
0 references
Sorting is one of the most fundamental problems in computer science since it is needed as a subroutine in many applications. This paper brings together two major branches of research on sorting. On the one hand there is the ongoing quest for the best sorting software on mainstream microprocessors -- mostly for sorting rather large inputs. On the other hand, there is a more specialized branch of research on sorting networks. These consist of elementary circuits (comparators) that take two input elements and output them in sorted order. Considerable research has been done on finding sorting networks with minimal size (number of comparators) or minimal depth (longest path) for small inputs (up to 20 elements). The connection between these two subjects is that most successful sorting algorithms break down the problem of sorting large inputs into many sorting problems for small inputs. Although these base cases constitute only an asymptotically vanishing fraction of the overall work, they constitute a surprisingly large fraction of the total work in practice. Equally surprisingly, the state of the art is to use a simple algorithm with quadratic complexity for the base cases (insertion sort). Using sorting networks for the base case is attractive since they need less comparisons than insertion sort and because no conditional branches are needed (in contrast to an alternative approach of using complex if-then-else nests which are very expensive on modern architectures that rely on high quality branch predictions). It turns out that additional algorithm engineering is needed to make this idea work well:{\parindent=0.7cm\begin{itemize}\item[--] Make sure that actually no conditional branches are made. \item[--] Exploit instruction level parallelism. \item[--] For inputs that do not fit into the register file, use networks that do not need too many memory accesses. \end{itemize}} Overall, a quicksort implementation using optimized sorting networks as base case can be up to 20\% faster than one using insertion sort.
0 references
sorting algorithms
0 references
sorting networks
0 references
instruction-level parallelism
0 references
out-of-order execution
0 references