
Overlapping communications in gyrokinetic codes on accelerator-based platforms

Asahi, Yuichi*; Latu, G.*; Bigot, J.*; Maeyama, Shinya*; Grandgirard, V.*; Idomura, Yasuhiro

Two five-dimensional gyrokinetic codes, GYSELA and GKV, were ported to modern accelerators: Xeon Phi KNL and Tesla P100 GPU. Serial computing kernels of GYSELA on KNL and GKV on P100 GPU were 1.3x and 7.4x faster, respectively, than those on a single Skylake processor. Scaling tests of GYSELA and GKV were performed from 16 to 512 KNLs and from 32 to 256 P100 GPUs, respectively; data transpose communications in the semi-Lagrangian kernels of GYSELA and in the convolution kernels of GKV were found to be the main bottlenecks. To mitigate the communication costs, pipeline-based and task-based communication overlapping were implemented in these codes.
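
The abstract does not describe how the overlapping was implemented, so the following is only a toy illustration of the general idea behind pipeline-based overlapping, not the authors' method. It assumes the data can be split into independent chunks, and it uses a background thread as a stand-in for a non-blocking transpose communication. The names `fake_transpose`, `compute`, `pipelined`, and `sequential` are hypothetical and do not come from GYSELA or GKV.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_transpose(chunk):
    # Hypothetical stand-in for a data transpose communication.
    # A real code would exchange data between processes instead.
    return [list(row) for row in zip(*chunk)]


def compute(chunk):
    # Hypothetical stand-in for the compute kernel on a transposed chunk.
    return sum(sum(row) for row in chunk)


def sequential(chunks):
    # Baseline: communicate, then compute, one chunk at a time.
    return [compute(fake_transpose(c)) for c in chunks]


def pipelined(chunks):
    # Pipeline: while chunk i is being computed, the transpose of
    # chunk i+1 is already in flight on the background worker.
    results = []
    if not chunks:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fake_transpose, chunks[0])
        for nxt in chunks[1:]:
            ready = pending.result()                    # wait for chunk i
            pending = pool.submit(fake_transpose, nxt)  # start chunk i+1
            results.append(compute(ready))              # overlaps with it
        results.append(compute(pending.result()))       # drain the pipeline
    return results


# Four small 2x3 chunks of made-up data.
chunks = [[[i + j + k for j in range(3)] for i in range(2)] for k in range(4)]
```

Calling `pipelined(chunks)` gives the same values as `sequential(chunks)`; only the ordering of communication and computation differs. In this sketch the benefit depends on the communication step actually running concurrently with the compute step, which plain Python threads do not guarantee for CPU-bound work.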







Category: Computer Science, Software Engineering


