All Publications
- Distributed-Memory Task Execution and Dependence Tracking within DAGuE and the DPLASMA Project, September 2010, ut-cs-10-660.pdf
- From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming, September 2010, ut-cs-10-656.pdf
- An Improved MAGMA GEMM for Fermi GPUs, July 2010, ut-cs-10-655.pdf
- International Exascale Software Project Roadmap v1.0, June 2010, ut-cs-10-654.pdf
- Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems, April 2010, ut-cs-10-653.pdf
- QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment, January 2010, ut-cs-10-651.pdf
- Performance evaluation for petascale quantum simulation tools, December 2009, ut-cs-09-650.pdf
- Accelerating GPU Kernels for Dense Linear Algebra, December 2009, ut-cs-09-648.pdf