Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems
Hartwig Anzt, Stanimire Tomov, Mark Gates, Jack Dongarra, Vincent Heuveline
This paper explores the need for asynchronous iteration algorithms as smoothers in multigrid methods. The hardware target for the new algorithms is top-of-the-line, highly parallel hybrid architectures: multi-core-based systems enhanced with GPGPUs. These architectures are the most likely candidates for future high-end supercomputers. To pave the road for their efficient use, we must resolve challenges arising from the fact that data movement, not floating-point operations, is the bottleneck to performance. Our work is in this direction: we designed block-asynchronous multigrid smoothers that perform more flops in order to reduce synchronization, and hence data movement. We show that the extra flops are done for 'free,' while synchronization is reduced and the convergence properties of multigrid with classical smoothers like Gauss-Seidel can be preserved.
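To make the trade of extra flops for less synchronization concrete, here is a minimal sequential sketch in Python. It is not the authors' GPU implementation. The model problem (1D Poisson, tridiag(-1, 2, -1)), the function names `residual_norm` and `block_async_sweep`, and the parameters `block_size` and `local_iters` are all assumptions chosen for illustration. Each block reads its off-block neighbours once, then performs several local Jacobi iterations on possibly stale halo values before writing back. This emulates, in a deterministic way, a block that iterates without synchronizing with the others.

```python
def residual_norm(x, b):
    """Max-norm residual of the 1D Poisson system tridiag(-1, 2, -1) x = b."""
    n = len(x)
    worst = 0.0
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        worst = max(worst, abs(b[i] - (2.0 * x[i] - left - right)))
    return worst


def block_async_sweep(x, b, block_size=4, local_iters=3):
    """One global sweep (hypothetical sketch, sequential emulation).

    Every block reads its off-block neighbours once and keeps them frozen
    while it runs `local_iters` Jacobi iterations on its own unknowns.
    The extra local iterations are the "additional flops"; the single halo
    read per sweep stands in for the reduced synchronization.
    """
    n = len(x)
    for start in range(0, n, block_size):
        end = min(start + block_size, n)
        # Halo values just outside the block, read once per global sweep.
        left_halo = x[start - 1] if start > 0 else 0.0
        right_halo = x[end] if end < n else 0.0
        local = x[start:end]
        for _ in range(local_iters):
            new = []
            for j in range(len(local)):
                left = local[j - 1] if j > 0 else left_halo
                right = local[j + 1] if j < len(local) - 1 else right_halo
                # Jacobi update for row (start + j) of tridiag(-1, 2, -1).
                new.append((b[start + j] + left + right) / 2.0)
            local = new
        x[start:end] = local
    return x


if __name__ == "__main__":
    n = 16
    b = [1.0] * n
    x = [0.0] * n
    print("initial residual:", residual_norm(x, b))
    for _ in range(50):
        block_async_sweep(x, b)
    print("residual after 50 sweeps:", residual_norm(x, b))
```

In this sketch, raising `local_iters` increases the arithmetic done per halo read, which mirrors the abstract's idea of spending flops to avoid data movement. On a real GPU the blocks would run concurrently and in no fixed order, which this sequential loop does not capture.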
Published 2011-12-06 05:00:00 as ut-cs-11-689 (ID:51)