Refine your search:     
Report No.
 - 

Porting a state-of-the-art communication avoiding Krylov subspace solver on P100 GPUs

Ali, Y.*; Ina, Takuya*; Onodera, Naoyuki   ; Idomura, Yasuhiro   

Krylov subspace solvers for the pressure Poisson equation occupy $$sim 90%$$ of the total computing cost in extreme scale multi-phase CFD simulation. To accelerate the Poisson solver, we port a Chebyshev Basis communication-avoiding Conjugate Gradient (CBCG) solver with block Jacobi (BJ) preconditioning on P100 GPUs. The CBCG solver consists of BJ preconditioning, Sparse Matrix Vector product (SpMV), and Tall-Skinny matrix operations. We re-design the BJ-preconditioner for thread-block parallelization and efficient coalescing data load, and apply batched gemm to the Tall-Skinny matrix operations. By these optimization, all main kernels achieved $$sim 90%$$ of the theoretical performance based on roofline estimation, and an order of magnitude speedup of the single node performance was obtained against CPU nodes.

Accesses

:

- Accesses

InCites™

:

Altmetrics

:

[CLARIVATE ANALYTICS], [WEB OF SCIENCE], [HIGHLY CITED PAPER & CUP LOGO] and [HOT PAPER & FIRE LOGO] are trademarks of Clarivate Analytics, and/or its affiliated company or companies, and used herein by permission and/or license.