Search Results: Records 1-9 displayed on this page of 9

Journal Articles

Acceleration of fusion plasma turbulence simulation on Fugaku and Summit

Idomura, Yasuhiro; Ina, Takuya*; Ali, Y.*; Imamura, Toshiyuki*

Dai-34-Kai Suchi Ryutai Rikigaku Shimpojiumu Koen Rombunshu (Internet), 6 Pages, 2020/12

A new communication-avoiding (CA) Krylov solver with an FP16 (half-precision) preconditioner is developed for a semi-implicit finite difference solver in the Gyrokinetic Toroidal 5D full-f Eulerian code GT5D. In the solver, the bottleneck of global collective communication is resolved using a CA-Krylov subspace method, and halo data communication is reduced by the FP16 preconditioner, which improves the convergence property. The FP16 preconditioner is designed based on the physics properties of the operator and is implemented using the new support for FP16 SIMD operations on A64FX. The solver is also ported to GPUs, and the performance of ITER-size simulations with $$\sim 0.1$$ trillion grids is measured on Fugaku (A64FX) and Summit (V100). The new solver accelerates GT5D by $$2\sim3\times$$ relative to the conventional non-CA solver, and excellent strong scaling is obtained up to 5,760 CPUs/GPUs on both Fugaku and Summit.

Journal Articles

Application of a preconditioned Chebyshev basis communication-avoiding conjugate gradient method to a multiphase thermal-hydraulic CFD code

Idomura, Yasuhiro; Ina, Takuya*; Mayumi, Akie; Yamada, Susumu; Imamura, Toshiyuki*

Lecture Notes in Computer Science 10776, p.257 - 273, 2018/00

 Times Cited Count:2 Percentile:49.74(Computer Science, Artificial Intelligence)

A preconditioned Chebyshev basis communication-avoiding conjugate gradient method (P-CBCG) is applied to the pressure Poisson equation in the multiphase thermal-hydraulic CFD code JUPITER, and its computational performance and convergence properties are compared against a preconditioned conjugate gradient (P-CG) method and a preconditioned communication-avoiding conjugate gradient (P-CACG) method on the Oakforest-PACS, which consists of 8,208 KNLs. The P-CBCG method reduces the number of collective communications while keeping the robustness of the convergence properties. Compared with the P-CACG method, the improved robustness enables an order of magnitude more communication-avoiding steps. It is shown that the P-CBCG method is $$1.38\times$$ and $$1.17\times$$ faster than the P-CG and P-CACG methods, respectively, at 2,000 processors.

Oral presentation

Development of computing technologies towards exascale fusion plasma simulations

Idomura, Yasuhiro

no journal, , 

This talk reviews exascale computing technologies for fusion plasma simulations developed under the Post-K priority issue. Burning plasmas in ITER consist of multi-species ions, and their spatio-temporal scales are more than an order of magnitude larger than those of existing devices. Therefore, burning plasma simulations of ITER require exascale computing. To this end, we have developed novel computing technologies for the five dimensional fusion plasma turbulence code GT5D, which enable highly efficient computation on the latest many-core processors and reduce inter-node communication. Their performance was demonstrated on the Oakforest-PACS, which consists of 8,208 XeonPhi7250 (KNL) processors.

Oral presentation

Development of computing technologies for extreme scale CFD simulations on many core platforms

Idomura, Yasuhiro

no journal, , 

This talk reviews computing technologies developed for extreme scale nuclear CFD simulations on the latest many-core computing platforms. At JAEA, extreme scale CFD simulations are needed for analyzing critical issues such as the melt relocation behavior of nuclear reactors in severe accidents and the environmental dynamics of radioactive substances. Although the latest many-core platforms offer promising solutions for such high computing needs, accelerated computation reveals severe bottlenecks in inter-node communication and data I/O. To resolve these issues, we have developed novel communication-avoiding matrix solvers and an In-Situ visualization system for the three dimensional multi-phase and multi-component thermal-hydraulic code JUPITER, and their performance was demonstrated on the Oakforest-PACS, which consists of 8,208 XeonPhi7250 (KNL) processors.

Oral presentation

Development of communication-avoiding matrix solvers for extreme scale CFD simulations on Oakforest-PACS

Idomura, Yasuhiro

no journal, , 

Thanks to new technologies such as KNL and MCDRAM, Oakforest-PACS (OfP) achieved significantly higher computing power and memory bandwidth than conventional multi-core platforms, and played an important role as a prototype of exascale supercomputers. We developed extreme scale nuclear CFD simulations on OfP, where an important issue was to resolve communication bottlenecks revealed by accelerated computation. This issue was resolved by developing communication-avoiding (CA) matrix solvers based on CA Krylov subspace methods and CA multigrid methods, which enabled high performance CFD simulations using the full system size of OfP. In this talk, we review CA matrix solvers developed for the five dimensional plasma simulation code GT5D and the three dimensional multi-phase multi-component thermal-hydraulic code JUPITER.

Oral presentation

Performance evaluation of a modified communication-avoiding generalized minimal residual method on many core platforms

Idomura, Yasuhiro; Ina, Takuya*; Mayumi, Akie; Yamada, Susumu; Matsumoto, Kazuya*; Asahi, Yuichi*; Imamura, Toshiyuki*

no journal, , 

We propose a modified communication-avoiding generalized minimal residual (CA-GMRES) method, which reduces both computation and memory access by 30% while keeping the same CA property as the original CA-GMRES method. These numerical properties, less communication and computation with higher arithmetic intensity, are promising features for future exascale machines with limited memory and network bandwidths. The modified CA-GMRES method is applied to a large scale non-symmetric matrix in an implicit solver of the gyrokinetic toroidal five dimensional Eulerian code GT5D, and its performance is estimated on the Oakforest-PACS (KNL). The numerical experiment shows that, compared with the generalized conjugate residual method, computing kernels are accelerated by 1.5x, and the cost of data reduction communication is reduced from 12.5% to 1% of the total cost at 1,280 nodes.

Oral presentation

Performance property of preconditioned Chebyshev basis CG solver for multiphase CFD simulations

Mayumi, Akie; Idomura, Yasuhiro; Ina, Takuya*; Yamada, Susumu; Imamura, Toshiyuki*

no journal, , 

Improving the convergence property of the communication-avoiding conjugate gradient (CA-CG) method is necessary for applying it to ill-conditioned problems such as the pressure Poisson equation in the multiphase CFD code JUPITER. In the CA-CG method, one can avoid more communication by increasing the number of CA steps. However, this makes the CA-CG method less robust against numerical errors. To resolve this problem, we apply the Chebyshev basis CG (CBCG) method to JUPITER.
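
The loss of robustness with more CA steps can be illustrated with a toy sketch. It is not the JUPITER solver or the authors' CBCG implementation: it assumes a diagonal test matrix with 200 evenly spaced eigenvalues in [1, 1000] and an all-ones start vector, and the function names `krylov_bases` and `worst_cosine` are hypothetical. The sketch builds a monomial Krylov basis (v, Av, A^2 v, ...) and a Chebyshev basis over the same subspace, then reports the largest absolute cosine between any two basis vectors. A value close to 1 means the vectors are nearly parallel, which is when rounding errors start to dominate.

```python
import math


def krylov_bases(eigs, steps):
    """Build monomial and Chebyshev Krylov bases for A = diag(eigs), v = ones.

    Assumes steps >= 1; each returned basis holds steps + 1 vectors.
    """
    lo, hi = min(eigs), max(eigs)
    # Shift and scale the spectrum [lo, hi] onto [-1, 1].
    x = [(2.0 * e - (hi + lo)) / (hi - lo) for e in eigs]
    v = [1.0] * len(eigs)

    # Monomial basis: v, A v, A^2 v, ...
    mono = [v]
    for _ in range(steps):
        mono.append([e * m for e, m in zip(eigs, mono[-1])])

    # Chebyshev basis: T_0 v, T_1 v, ... using T_{k+1} = 2 x T_k - T_{k-1}.
    cheb = [v, x]
    for _ in range(steps - 1):
        cheb.append([2.0 * xi * a - b for xi, a, b in zip(x, cheb[-1], cheb[-2])])
    return mono, cheb


def worst_cosine(basis):
    """Largest |cos(angle)| over all pairs of distinct basis vectors."""
    worst = 0.0
    for i in range(len(basis)):
        for j in range(i + 1, len(basis)):
            dot = sum(a * b for a, b in zip(basis[i], basis[j]))
            norm = math.sqrt(sum(a * a for a in basis[i]) * sum(b * b for b in basis[j]))
            worst = max(worst, abs(dot) / norm)
    return worst


# Toy spectrum (an assumption for illustration): 200 eigenvalues in [1, 1000].
eigs = [1.0 + 999.0 * i / 199 for i in range(200)]
mono, cheb = krylov_bases(eigs, steps=10)
print("monomial  worst |cos|:", worst_cosine(mono))
print("Chebyshev worst |cos|:", worst_cosine(cheb))
```

In this toy setting the monomial vectors become almost parallel after ten steps, while the Chebyshev vectors stay well separated, which is consistent with the motivation given for moving from CA-CG to CBCG.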

Oral presentation

Exa-scale computing techniques in gyrokinetic codes

Idomura, Yasuhiro

no journal, , 

A communication-avoiding generalized minimal residual (CA-GMRES) method is applied to the gyrokinetic toroidal five dimensional Eulerian code GT5D, and its performance is compared against the original code with a generalized conjugate residual (GCR) method on the Oakforest-PACS (KNL). The CA-GMRES method requires fewer memory accesses and collective communications than the GCR method, and is thus suitable for future exascale architectures with limited memory and network bandwidths. It is shown that, compared with the original GCR version, the CA-GMRES version is accelerated by 1.32x, and the cost of data reduction communication is reduced from ~13% to ~1% of the total cost at 1,280 nodes.

Oral presentation

Optimization of fusion plasma simulations on Fugaku

Idomura, Yasuhiro

no journal, , 

The gyrokinetic toroidal 5D Eulerian code GT5D resolves global torus plasma using 5D grids, and core plasma simulations of ITER require exascale computing on Fugaku. To this end, we developed a new communication-avoiding (CA) Krylov solver with FP16 preconditioning for implicit finite difference computation, which accounts for more than 80% of the total computing cost in GT5D. In this solver, a bottleneck of global collective communication is resolved using a CA Krylov subspace method. In addition, halo communication is reduced by improving the convergence property with FP16 preconditioning. The FP16 preconditioner was designed based on physics properties of the operator, and was implemented using FP16 SIMD operations. Compared with the conventional solver, the new solver improved the performance of ITER-size simulations with ~100 billion grids by ~3.5x, and good strong scaling was achieved up to 5,760 nodes.
