Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project
Bosilca, G., Bouteiller, A., Danalis, A, Faverge, M., Haidar, H., Herault, T., Kurzak, J., Langou, J., Lemarinier, P., Ltaief, H., Luszczek, P., YarKhan, A., Dongarra, J.
We present DPLASMA, a new project related to PLASMA, that operates in the distributed memory regime. It uses a new generic distributed Direct Acyclic Graph engine for high performance computing (DAGuE). Our work also takes advantage of some of the features of DAGuE, such as DAG representation that is independent of problem-size, overlapping of communication and computation, task prioritization, architecture-aware scheduling and management of micro-tasks on distributed architectures that feature heterogeneous many-core nodes. The originality of this engine is that it is capable of translating a sequential nested-loop code into a concise and synthetic format which it can be interpret and then execute in a distributed environment. We consider three common dense linear algebra algorithms, namely: Cholesky, LU and QR factorizations, to investigate their data driven expression and execution in a distributed system. We demonstrate from our preliminary results that our DAG-based approach has the potential to bridge the gap between the peak and the achieved performance that is characteristic in the stateof-the-art distributed numerical softwares on current and emerging architectures.
Published 2010-09-15 04:00:00 as ut-cs-10-660 (ID:62)