An Implementation of the MRRR Algorithm on a Data-Parallel Coprocessor

|
| Figure 1: Performance of the MRRR algorithm for random input matrices. |
Abstract:
We present an implementation of the Algorithm of Multiple Relatively
Robust Representations (MRRR) for the symmetric tridiagonal eigenvalue
problem on a data-parallel coprocessor using the CUDA programming
environment. Our results demonstrate that the algorithm maps well onto
a data-parallel architecture and we achieve up to 50-fold speedups
over LAPACK’s MRRR implementation. Our accuracy lacks currently behind
those of LAPACK but we believe that these problems can be overcome in
the near future without significantly affecting performance.
Technical Report
Christian Lessig, An Implementation of the MRRR Algorithm on a Data-Parallel Coprocessor, Technical Report, University of Toronto, 2008. (pdf, bibtex)
Source Code
Source code
of our implementation released under the GPL3 license. Please refer to the README file for details. Note that currently only CUDA 1.1 and Linux are supported (although the changes are minor to othercome these limitations).
Precompiled CLAPACK binaries for Linux and Windows.
|