Idomura, Yasuhiro
no journal
The Gyrokinetic Toroidal 5D full-f Eulerian code GT5D is based on a semi-implicit finite difference scheme in which a stiff linear 4D convection operator is integrated implicitly in time. The implicit finite difference solver for fast kinetic electrons accounts for more than 80% of the total computing cost. The implicit solver was originally developed using a Krylov subspace method, in which global collective communications and halo data communications were becoming bottlenecks on the latest accelerator-based platforms. To resolve this issue, the convergence property was improved by using a new FP16 preconditioner, which reduced the number of iterations, and thus the communications, by an order of magnitude. A communication-avoiding (CA) solver based on the FP16 preconditioner was developed by utilizing the new support for FP16 SIMD operations on Fugaku, and was also ported to Summit. The new CA solver showed significant speedups on both Fugaku and Summit, demonstrating its performance portability.
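The idea of pairing a low-precision preconditioner with a full-precision Krylov iteration can be illustrated with a toy sketch. It is not the GT5D solver: the 1D constant-coefficient tridiagonal operator, the choice of GCR as the Krylov method, the Jacobi-sweep preconditioner, and all function names (`fp16`, `jacobi_fp16`, `gcr`) are assumptions made for illustration only. FP16 rounding is emulated with the half-precision format of Python's `struct` module, and the residual is rescaled before rounding so that small residuals do not underflow.

```python
import struct

def fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(lo, d, up, x):
    """y = A x in FP64 for a tridiagonal A with constant bands (lo, d, up)."""
    n = len(x)
    y = [d * v for v in x]
    for i in range(1, n):
        y[i] += lo * x[i - 1]
    for i in range(n - 1):
        y[i] += up * x[i + 1]
    return y

def jacobi_fp16(lo, d, up, r, sweeps=8):
    """Approximate A^{-1} r with Jacobi sweeps rounded to FP16 after each operation."""
    scale = max(abs(v) for v in r)
    if scale == 0.0:
        return [0.0] * len(r)
    n = len(r)
    lo16, up16, inv_d16 = fp16(lo), fp16(up), fp16(1.0 / d)
    r16 = [fp16(v / scale) for v in r]  # normalise first to avoid FP16 underflow
    z = [0.0] * n
    for _ in range(sweeps):
        new = []
        for i in range(n):
            acc = r16[i]
            if i > 0:
                acc = fp16(acc - fp16(lo16 * z[i - 1]))
            if i < n - 1:
                acc = fp16(acc - fp16(up16 * z[i + 1]))
            new.append(fp16(inv_d16 * acc))
        z = new
    return [scale * v for v in z]  # undo the scaling in FP64

def gcr(lo, d, up, b, precond=None, tol=1e-8, max_iter=200):
    """Generalized conjugate residual method in FP64 with an optional preconditioner."""
    x = [0.0] * len(b)
    r = list(b)
    b_norm = dot(b, b) ** 0.5
    zs, qs = [], []
    for it in range(1, max_iter + 1):
        z = precond(r) if precond else list(r)
        q = matvec(lo, d, up, z)
        # Orthogonalise q = A z against earlier directions (modified Gram-Schmidt),
        # applying the same combination to z so that q = A z still holds.
        for zj, qj in zip(zs, qs):
            beta = dot(q, qj)
            q = [a - beta * c for a, c in zip(q, qj)]
            z = [a - beta * c for a, c in zip(z, zj)]
        q_norm = dot(q, q) ** 0.5
        q = [a / q_norm for a in q]
        z = [a / q_norm for a in z]
        zs.append(z)
        qs.append(q)
        alpha = dot(r, q)
        x = [xi + alpha * zi for xi, zi in zip(x, z)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
    return x, max_iter

# Toy nonsymmetric, diagonally dominant operator (hypothetical coefficients).
lo, d, up = -3.0, 5.0, -1.0
b = [1.0] * 64
x_plain, its_plain = gcr(lo, d, up, b)
x_prec, its_prec = gcr(lo, d, up, b, precond=lambda r: jacobi_fp16(lo, d, up, r))
print("outer iterations without / with FP16 preconditioner:", its_plain, its_prec)
```

In this sketch each outer iteration performs dot products, which would be global reductions in a distributed code, so a smaller outer iteration count stands in for fewer collective communications. The accuracy of the final solution is set by the FP64 residual update, not by the FP16 arithmetic inside the preconditioner.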
Grandgirard, V.*; Asahi, Yuichi; Bigot, J.*; Bourne, E.*; Dif-Pradalier, G.*; Donnel, P.*; Garbet, X.*; Ghendrih, P.*
no journal
Core transport modelling in tokamak plasmas has now reached maturity, with non-linear 5D gyrokinetic codes available around the world to address this issue. However, despite numerous successes, their predictive capabilities are still challenged, especially for optimized discharges. Bridging this gap requires extending gyrokinetic modelling to the edge and close to the material boundaries, preferably addressing edge and core transport on an equal footing. This is one of the long-term challenges for the petascale code GYSELA [V. Grandgirard et al., CPC 2017 (35)]. Edge-core turbulent plasma simulations with kinetic electrons will require exascale HPC capabilities. We present here the different strategies that we are currently exploring to target the disruptive use of the billions of computing cores expected in exascale-class supercomputers, such as OpenMP 4.5 tasks for overlapping computations and MPI communications, and Kokkos for performance-portable programming and code refactoring.