An Implementation of the MRRR Algorithm on a Data-Parallel Coprocessor



Figure 1: Performance of the MRRR algorithm for random input matrices.


Abstract: We present an implementation of the Algorithm of Multiple Relatively Robust Representations (MRRR) for the symmetric tridiagonal eigenvalue problem on a data-parallel coprocessor using the CUDA programming environment. Our results demonstrate that the algorithm maps well onto a data-parallel architecture and we achieve up to 50-fold speedups over LAPACK’s MRRR implementation. Our accuracy lacks currently behind those of LAPACK but we believe that these problems can be overcome in the near future without significantly affecting performance.


Technical Report

Christian Lessig, An Implementation of the MRRR Algorithm on a Data-Parallel Coprocessor, Technical Report, University of Toronto, 2008. (pdf, bibtex)

Source Code

Source code
of our implementation released under the GPL3 license. Please refer to the README file for details. Note that currently only CUDA 1.1 and Linux are supported (although the changes are minor to othercome these limitations).

Precompiled CLAPACK binaries for Linux and Windows.

© lessig (at) dgp (dot) toronto (dot) edu May 2008