Journal Articles

Acceleration of fusion plasma turbulence simulations using the mixed-precision communication-avoiding Krylov method

Idomura, Yasuhiro; Ina, Takuya*; Ali, Y.*; Imamura, Toshiyuki*

Proceedings of International Conference on High Performance Computing, Networking, Storage, and Analysis (SC 2020) (Internet), p.1318 - 1330, 2020/11

The multi-scale full-$$f$$ simulation of the next-generation experimental fusion reactor ITER based on a five-dimensional (5D) gyrokinetic model is one of the most computationally demanding problems in fusion science. In this work, a Gyrokinetic Toroidal 5D Eulerian code (GT5D) is accelerated by a new mixed-precision communication-avoiding (CA) Krylov method. The bottleneck of global collective communication on accelerated computing platforms is resolved using a CA Krylov method. In addition, a new FP16 preconditioner, which is designed using the new support for FP16 SIMD operations on A64FX, reduces both the number of iterations (halo data communication) and the computational cost. For ITER-size simulations with 0.1 trillion grids on 1,440 CPUs/GPUs on Fugaku and Summit, the proposed method shows speedups of 2.8x and 1.9x, respectively, over the conventional non-CA Krylov method, and excellent strong scaling is obtained up to 5,760 CPUs/GPUs.

JAEA Reports

Design and mounting advanced wireless communication equipment on the crawler-type robots for tasks and on the crawler-type scouting robot

Nishiyama, Yutaka; Iwai, Masaki; Tsubaki, Hirohiko; Chiba, Yusuke; Hayasaka, Toshiro*; Ono, Hayato*; Hanyu, Toshinori*

JAEA-Technology 2020-006, 26 Pages, 2020/08


The Maintenance and Operation Section for Remote Control Equipment at the Naraha Center for Remote Control Technology Development is the main part of JAEA's nuclear emergency response team under the Act on Special Measures Concerning Nuclear Emergency Preparedness. The section needs to remodel crawler-type robots for tasks, crawler-type scouting robots, and other equipment. For the two crawler-type robots for tasks, the section designed and mounted advanced wireless communication equipment on the manipulators mounted on the two robots. The crawler part of each robot can now be controlled via the new equipment, and if the new equipment breaks down, control can be switched to the original equipment. In addition, when wireless relay robots are needed, the new equipment allows both the crawler part and the manipulator part of the robot to be controlled through a single relay robot. After checking the capabilities and characteristics of five wireless communication devices, the section chose the best one and mounted it on one crawler-type scouting robot. This report describes the design and mounting of advanced wireless communication equipment on the two crawler-type robots for tasks and on the one crawler-type scouting robot.

Journal Articles

Overlapping communications in gyrokinetic codes on accelerator-based platforms

Asahi, Yuichi*; Latu, G.*; Bigot, J.*; Maeyama, Shinya*; Grandgirard, V.*; Idomura, Yasuhiro

Concurrency and Computation; Practice and Experience, 32(5), p.e5551_1 - e5551_21, 2020/03

Times Cited Count: 0, Percentile: 100 (Computer Science, Software Engineering)

Two five-dimensional gyrokinetic codes, GYSELA and GKV, were ported to modern accelerators, Xeon Phi KNL and Tesla P100 GPU. Serial computing kernels of GYSELA on KNL and GKV on P100 GPU were respectively 1.3x and 7.4x faster than those on a single Skylake processor. Scaling tests of GYSELA and GKV were performed from 16 to 512 KNLs and from 32 to 256 P100 GPUs, respectively, and data transpose communications in the semi-Lagrangian kernels of GYSELA and in the convolution kernels of GKV were found to be the main bottlenecks. To mitigate the communication costs, pipeline-based and task-based communication overlapping were implemented in these codes.

Journal Articles

Implementation and performance evaluation of a communication-avoiding GMRES method for stencil-based code on GPU cluster

Matsumoto, Kazuya*; Idomura, Yasuhiro; Ina, Takuya*; Mayumi, Akie; Yamada, Susumu

Journal of Supercomputing, 75(12), p.8115 - 8146, 2019/12

Times Cited Count: 0, Percentile: 100 (Computer Science, Hardware & Architecture)

A communication-avoiding generalized minimal residual method (CA-GMRES) is implemented on a hybrid CPU-GPU cluster to accelerate the iterative linear system solver in the gyrokinetic toroidal five-dimensional Eulerian code GT5D. In addition to the CA-GMRES, we implement and evaluate a modified variant of CA-GMRES (M-CA-GMRES), proposed in our previous study to reduce the amount of floating-point calculations. This study demonstrates that the beneficial features of the CA-GMRES are its minimal number of collective communications and its highly efficient calculations based on dense matrix-matrix operations. The performance evaluation is conducted on the Reedbush-L GPU cluster, which contains four NVIDIA Tesla P100 GPUs per compute node. The evaluation results show that the M-CA-GMRES is 1.09x, 1.22x and 1.50x faster than the CA-GMRES, the generalized conjugate residual method (GCR), and the GMRES, respectively, when 64 GPUs are used.

Journal Articles

GPU acceleration of communication avoiding Chebyshev basis conjugate gradient solver for multiphase CFD simulations

Ali, Y.*; Onodera, Naoyuki; Idomura, Yasuhiro; Ina, Takuya*; Imamura, Toshiyuki*

Proceedings of 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA 2019), p.1 - 8, 2019/11

Times Cited Count: 5, Percentile: 0.54

Iterative methods for solving large linear systems are common parts of computational fluid dynamics (CFD) codes. The Preconditioned Conjugate Gradient (P-CG) method is one of the most widely used iterative methods. However, in the P-CG method, global collective communication is a crucial bottleneck, especially on accelerated computing platforms. To resolve this issue, communication-avoiding (CA) variants of the P-CG method are becoming increasingly important. In this paper, the P-CG and Preconditioned Chebyshev Basis CA CG (P-CBCG) solvers in the multiphase CFD code JUPITER are ported to the latest V100 GPUs. All GPU kernels are highly optimized to achieve about 90% of the roofline performance, the block Jacobi preconditioner is re-designed to extract the high computing power of GPUs, and the remaining bottleneck of halo data communication is avoided by overlapping communication and computation. The overall performance of the P-CG and P-CBCG solvers is determined by the competition between the CA properties of the global collective communication and the halo data communication, indicating the importance of the inter-node interconnect bandwidth per GPU. The developed GPU solvers are accelerated up to 2x compared with the former CPU solvers on KNLs, and excellent strong scaling is achieved up to 7,680 GPUs on Summit.

Journal Articles

7.2.3 Towards implementation of Fukushima environmental remediation

Miyahara, Kaname; Kawase, Keiichi

Genshiryoku No Ima To Ashita, p.159 - 167, 2019/03

This manuscript overviews lessons learned from decontamination pilot projects towards implementation of regional remediation after the environmental contamination due to the Fukushima Daiichi Nuclear Power Plant Accidents.

Journal Articles

Communication avoiding multigrid preconditioned conjugate gradient method for extreme scale multiphase CFD simulations

Idomura, Yasuhiro; Ina, Takuya*; Yamashita, Susumu; Onodera, Naoyuki; Yamada, Susumu; Imamura, Toshiyuki*

Proceedings of 9th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA 2018) (Internet), p.17 - 24, 2018/11

Times Cited Count: 1, Percentile: 38.07

A communication-avoiding (CA) multigrid preconditioned conjugate gradient method (CAMGCG) is applied to the pressure Poisson equation in the multiphase CFD code JUPITER, and its computational performance and convergence properties are compared against CA Krylov methods. In the JUPITER code, the CAMGCG solver has robust convergence properties regardless of the problem size, and shows both communication reduction and convergence improvement, leading to a higher performance gain than CA Krylov solvers, which achieve only the former. The CAMGCG solver is applied to extreme-scale multiphase CFD simulations with $$\sim 90$$ billion DOFs, and it is shown that, compared with a preconditioned CG solver, the number of iterations is reduced to $$\sim 1/800$$, and a $$\sim 11.6\times$$ speedup is achieved while maintaining excellent strong scaling up to 8,000 nodes on the Oakforest-PACS.

Journal Articles

Challenges for enhancing Fukushima environmental resilience, 10; Dose evaluation and risk communication

Saito, Kimiaki; Takahara, Shogo; Uezu, Yasuhiro

Nippon Genshiryoku Gakkai-Shi, 60(2), p.111 - 115, 2018/02

no abstracts in English

Journal Articles

Application of a communication-avoiding generalized minimal residual method to a gyrokinetic five dimensional Eulerian code on many core platforms

Idomura, Yasuhiro; Ina, Takuya*; Mayumi, Akie; Yamada, Susumu; Matsumoto, Kazuya*; Asahi, Yuichi*; Imamura, Toshiyuki*

Proceedings of 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA 2017), p.7_1 - 7_8, 2017/11

A communication-avoiding generalized minimal residual (CA-GMRES) method is applied to the gyrokinetic toroidal five-dimensional Eulerian code GT5D, and its performance is compared against the original code with a generalized conjugate residual (GCR) method on the JAEA ICEX (Haswell), the Plasma Simulator (FX100), and the Oakforest-PACS (KNL). The CA-GMRES method has $$\sim 3.8\times$$ higher arithmetic intensity than the GCR method, and thus is suitable for future exascale architectures with limited memory and network bandwidths. In the performance evaluation, it is shown that, compared with the GCR solver, its computing kernels are accelerated by $$1.47\times \sim 2.39\times$$, and the cost of data reduction communication is reduced from $$5\% \sim 13\%$$ to $$\sim 1\%$$ of the total cost at 1,280 nodes.

Journal Articles

Challenges for enhancing Fukushima environmental resilience, 1; Status and lessons

Miyahara, Kaname; Ohara, Toshimasa*

Nippon Genshiryoku Gakkai-Shi, 59(5), p.282 - 286, 2017/05

This review highlights the challenges faced by JAEA and NIES in enhancing Fukushima environmental resilience through multifaceted research carried out with many public and private sector organizations and academia.

Journal Articles

An Overview of progress in environmental research on radioactive materials derived from the Fukushima Nuclear accident

Ohara, Toshimasa*; Miyahara, Kaname

Global Environmental Research (Internet), 20(1&2), p.3 - 13, 2017/03

Toward environmental regeneration in Fukushima Prefecture and other areas after the Fukushima Daiichi Nuclear Power Station accidents, JAEA and NIES, working with many public and private sector organizations and academia, have carried out multifaceted research that will help to restore the environment of affected areas. These challenging efforts need to be further strengthened.

Journal Articles

Fukushima cleanup; Status and lessons

Miyahara, Kaname; McKinley, I. G.*; Saito, Kimiaki; Iijima, Kazuki; Hardie, S. M. L.*

Nuclear Engineering International, 60(736), p.12 - 14, 2015/11

Remediation work in Fukushima is based on a comprehensive technical knowledge base, which is translated into actions that enable the rapid return of evacuees but also provides a globally valuable resource for disaster planning and contaminated site remediation.

Journal Articles

Effective 3D data visualization in deep shaft construction

Inagaki, Daisuke*; Tsusaka, Kimikazu*; Aoyagi, Kazuhei; Nago, Makito*; Ijiri, Yuji*; Shigehiro, Michiko*

Proceedings of ITA-AITES World Tunnel Congress 2015 (WTC 2015)/41st General Assembly, 10 Pages, 2015/05

Journal Articles

Conditions for betterment of public acceptance

Sobajima, Makoto

Nippon Genshiryoku Gakkai-Shi, 46(2), p.94 - 98, 2004/02

Nuclear energy has a strong relation to society. However, due to accidents and scandals in recent years, public trust in nuclear energy has significantly declined, and nuclear energy is becoming a source of worry. Analyzing this situation and grasping the underlying problems are serious tasks for people engaged in the nuclear field. In order for nuclear energy to be properly used in society, communication with the general public and with communities around nuclear power plant sites is becoming increasingly important, as is grasping the situation and surveying measures for overcoming the problems. On the basis of such an analysis, various activities for betterment of public acceptance of nuclear energy by nuclear industry workers, researchers and the government are proposed.

Journal Articles

Nonlocal transport phenomena and various structure formations in plasmas; Introduction

Kishimoto, Yasuaki

Purazuma, Kaku Yugo Gakkai-Shi, 78(9), p.857 - 860, 2002/09

In order to understand the various phenomena related to nonlocal transport and structure formation in plasmas, we review topics in the fields of (1) laser implosion plasma, (2) space plasma, and (3) magnetic fusion plasma, as a spatial series.

JAEA Reports

Social acceptance of technologies in relation to their benefit and harm

Sobajima, Makoto

JAERI-Review 2001-011, 90 Pages, 2001/03


no abstracts in English

JAEA Reports

The Specification of Stampi; A Message passing library for distributed parallel computing

Imamura, Toshiyuki; Takemiya, Hiroshi*; Koide, Hiroshi

JAERI-Data/Code 2000-007, p.114 - 0, 2000/03


no abstracts in English

JAEA Reports

Starpc: A Library for communication among tools on a parallel computer cluster; User's and developer's guide for Starpc

Takemiya, Hiroshi*; Yamagishi, Nobuhiro*

JAERI-Data/Code 2000-006, p.172 - 0, 2000/02


no abstracts in English

JAEA Reports

Stampi: A Message passing library for distributed parallel computing; User's guide, second edition

Imamura, Toshiyuki; Koide, Hiroshi; Takemiya, Hiroshi*

JAERI-Data/Code 2000-002, p.75 - 0, 2000/02


no abstracts in English

Journal Articles

An Estimation of complexity and computational costs for vertical block-cyclic distributed parallel LU factorization

Imamura, Toshiyuki

Journal of Supercomputing, 15(1), p.95 - 110, 2000/00

Times Cited Count: 2, Percentile: 69.1 (Computer Science, Hardware & Architecture)

no abstracts in English

