
EECS Publication

Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited

Hatem Ltaief, Jakub Kurzak, and Jack Dongarra

The objective of this paper is to extend and redesign the block matrix reduction applied to the family of two-sided factorizations, introduced by Dongarra et al. [9], for multicore architectures using algorithms-by-tiles. In particular, the Block Hessenberg Reduction is very often used as a pre-processing step in solving dense linear algebra problems, such as the standard eigenvalue problem. Although expensive, orthogonal transformations are commonly used for this reduction because they guarantee stability, as opposed to Gaussian Elimination. Two versions of the Block Hessenberg Reduction are presented in this paper, the first with Householder reflectors and the second with Givens rotations. A brief investigation of variants of Fast Givens rotations is also included. Furthermore, in the latest Top500 list from June 2008, 98% of the fastest parallel systems in the world are based on multicores. The emerging petascale systems, consisting of hundreds of thousands of cores, have exacerbated the problem further, and it becomes judicious to efficiently integrate existing or new numerical linear algebra algorithms suitable for such hardware. By exploiting the concepts of algorithms-by-tiles in the multicore environment (i.e., a high level of parallelism with fine granularity, and a high-performance data representation combined with dynamic data-driven execution), the Block Hessenberg Reduction presented here achieves 72% of the DGEMM peak on a 12000 x 12000 matrix with 16 Intel Tigerton 2.4 GHz processors.
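To illustrate the reduction the abstract refers to, the following is a minimal, unblocked sketch of Hessenberg reduction via Householder reflectors in Python/NumPy. It is not the tiled, parallel algorithm of the paper (which partitions the matrix into tiles and schedules fine-grained tasks dynamically); it only shows the underlying orthogonal similarity transformation that zeroes out entries below the first subdiagonal. The function name and structure are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hessenberg_householder(A):
    """Reduce a square matrix to upper Hessenberg form using
    Householder reflectors applied as a similarity transformation
    (H = Q^T A Q), so eigenvalues are preserved."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for k in range(n - 2):
        # Target column: entries below the subdiagonal in column k.
        x = A[k + 1:, k].copy()
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue  # Column already in Hessenberg form.
        # Choose the sign to avoid cancellation in v[0].
        alpha = -np.copysign(norm_x, x[0])
        v = x
        v[0] -= alpha
        v /= np.linalg.norm(v)
        # Apply H_k = I - 2 v v^T from the left ...
        A[k + 1:, k:] -= 2.0 * np.outer(v, v @ A[k + 1:, k:])
        # ... and from the right, to keep the transformation a similarity.
        A[:, k + 1:] -= 2.0 * np.outer(A[:, k + 1:] @ v, v)
    return A
```

The tiled version in the paper reorganizes these same reflector applications into operations on small square tiles so that many cores can update independent tiles concurrently; the numerical result is the same Hessenberg form.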

Published  2008-08-07 04:00:00  as  ut-cs-08-624 (ID:99)

ut-cs-08-624.pdf
