EECS Publication

Scheduling Linear Algebra Operations on Multicore Processors

Jakub Kurzak, Hatem Ltaief, Jack Dongarra, and Rosa M. Badia

State-of-the-art dense linear algebra software, such as the LAPACK and ScaLAPACK libraries, suffers performance losses on multi-core processors because it cannot fully exploit thread-level parallelism. At the same time, the coarse-grain dataflow model is gaining popularity as a paradigm for programming multi-core architectures. This work looks at implementing classic dense linear algebra workloads (Cholesky factorization, QR factorization, and LU factorization) using dynamic data-driven execution. Two emerging approaches to implementing coarse-grain dataflow are examined: the model of nested parallelism, represented by the Cilk framework, and the model of parallelism expressed through an arbitrary Directed Acyclic Graph, represented by the SMP Superscalar framework. Performance and coding effort are analyzed and compared against code manually parallelized at the thread level.

Published 2009-02-06 05:00:00 as ut-cs-09-636 (ID:68)

ut-cs-09-636.pdf
