Nuclear fusion simulation code optimization on GPU clusters

Fujita, Norihisa*; Nuga, Hideo*; Boku, Taisuke*; Idomura, Yasuhiro   

A fusion plasma turbulence simulation code, GT5D, is optimized for GPU clusters with multiple GPUs per node. On the HA-PACS GPU cluster, the optimized code runs up to 3.37 times faster than CPU-only execution in function-level evaluation and 2.03 times faster overall. The overall result includes a 53% performance gain from overlapping MPI communication with GPU computation.



