The SOR method on parallel computers (Q1124272)
scientific article
1989
The successive overrelaxation (SOR) method does not parallelize immediately because of data dependencies among the iteration vectors. The crucial dependency can be removed if, in the \(m\)th SOR step, an intermediate vector needed in the next SOR step is computed along with the \(m\)th iteration vector. Combining this idea with standard parallel matrix-vector multiplication, it is easy to derive an SOR algorithm for dense matrices that performs much better in parallel, on a shared-memory machine, than the standard SOR method. For banded matrices, however, parallelization cannot be achieved merely by parallel matrix-vector multiplication. Instead, it can be achieved by `pipelining' \(p\) successive SOR steps (where \(p\) is the number of parallel processors) and working in the \(m\)th step with an iteration vector of the form \[ \hat x^{(m)} = \begin{pmatrix} x_{(1)}^{(m+p-1)} \\ x_{(2)}^{(m+p-2)} \\ \vdots \\ x_{(p)}^{(m)} \end{pmatrix} \in {\mathfrak R}^N. \] Here, \(N\) is the dimension of the matrix. Each block \(x^{(k)}_{(i)}\) has length \(N/p\) and is, in fact, identical to the corresponding part of the \(k\)th iteration vector \(x^{(k)}\) of standard SOR (with \(x^{(0)} = \dots = x^{(1-p)} = 0\)). Thus, after the first \(p\) steps this pipelined algorithm runs fully in parallel, and \(\hat x^{(m)}\) \((m > p)\) is never worse than \(x^{(m)}\).
band matrices
parallel computation
successive overrelaxation