Search Results: Records 1-20 displayed on this page of 40

Journal Articles

AMR-Net: Convolutional neural networks for multi-resolution steady flow prediction

Asahi, Yuichi; Hatayama, Sora*; Shimokawabe, Takashi*; Onodera, Naoyuki; Hasegawa, Yuta; Idomura, Yasuhiro

Proceedings of 2021 IEEE International Conference on Cluster Computing (IEEE Cluster 2021) (Internet), p.686 - 691, 2021/10

We develop a convolutional neural network model to predict multi-resolution steady flow. Based on the state-of-the-art image-to-image translation model pix2pixHD, our model can predict the high-resolution flow field from a set of patched signed distance functions. By patching the high-resolution data, the memory requirements of our model are reduced compared to pix2pixHD.

Journal Articles

Multi-resolution steady flow prediction with convolutional neural networks

Asahi, Yuichi; Hatayama, Sora*; Shimokawabe, Takashi*; Onodera, Naoyuki; Hasegawa, Yuta; Idomura, Yasuhiro

Keisan Kogaku Koenkai Rombunshu (CD-ROM), 26, 4 Pages, 2021/05

We develop a convolutional neural network model to predict multi-resolution steady flow. Based on the state-of-the-art image-to-image translation model Pix2PixHD, our model can predict the high-resolution flow field from the signed distance function. By patching the high-resolution data, the memory requirements of our model are reduced compared to Pix2PixHD.

Journal Articles

Data-driven analyses of avalanche like turbulent transport phenomena

Asahi, Yuichi; Fujii, Keisuke*

Purazuma, Kaku Yugo Gakkai-Shi, 97(2), p.86 - 92, 2021/02

5D gyrokinetic simulation data have been analyzed with data-driven analysis methods. By defining an entropy-like quantity with singular values, we have quantitatively evaluated the randomness of the plasma state. We found that the randomness of the plasma increases after the avalanche-like transport and then gradually decreases. Since the decrease of the randomness is expected to be relevant to the phase space structure formation, we have developed a method to extract the phase space structures from the time series of 5D data. The relationship between the avalanche-like transport and phase space structures is discussed based on the contribution of each principal component to the energy transport.

Journal Articles

Dynamics of enhanced neoclassical particle transport of tracer impurity ions in ion temperature gradient driven turbulence

Idomura, Yasuhiro; Obrejan, K.*; Asahi, Yuichi; Honda, Mitsuru*

Physics of Plasmas, 28(1), p.012501_1 - 012501_11, 2021/01

 Times Cited Count:3 Percentile:95.53(Physics, Fluids & Plasmas)

Tracer impurity transport in ion temperature gradient driven (ITG) turbulence is investigated using a global full-$$f$$ gyrokinetic simulation including kinetic electrons, bulk ions, and low to medium $$Z$$ tracer impurities, where $$Z$$ is the charge number. It is found that in addition to turbulent particle transport, enhanced neoclassical particle transport due to a new synergy effect between turbulent and neoclassical transports makes a significant contribution to tracer impurity transport. Bursty excitation of the ITG mode generates non-ambipolar turbulent particle fluxes of electrons and bulk ions, leading to a fast growth of the radial electric field following the ambipolar condition. The divergence of $$E\times B$$ flows compresses up-down asymmetric density perturbations, which are subject to transport induced by the magnetic drift. The enhanced neoclassical particle transport depends on the ion mass, because the magnitude of up-down asymmetric density perturbation is determined by a competition between the $$E\times B$$ compression effect and the return current given by the parallel streaming motion. This mechanism does not work for the temperature, and thus, selectively enhances only particle transport.

Journal Articles

Performance evaluation of block-structured Poisson solver on GPU, CPU, and ARM processors

Onodera, Naoyuki; Idomura, Yasuhiro; Asahi, Yuichi; Hasegawa, Yuta; Shimokawabe, Takashi*; Aoki, Takayuki*

Dai-34-Kai Suchi Ryutai Rikigaku Shimpojiumu Koen Rombunshu (Internet), 2 Pages, 2020/12

We develop a multigrid preconditioned conjugate gradient (MG-CG) solver for the pressure Poisson equation in the two-phase flow CFD code JUPITER. The code is written in C++ and CUDA to maintain portability across multiple platforms. The main kernels of the CG solver achieve reasonable performance, at 0.4 $$\sim$$ 0.75 of the roofline performance, and the performance of the MG preconditioner is also reasonable on NVIDIA GPU and Intel CPU. However, the performance degradation of the SpMV kernel on ARM is significant. It is confirmed that the optimization does not work if any functions are included in the loop.

Journal Articles

Performance portable implementation of a kinetic plasma simulation mini-app with a higher level abstraction and directives

Asahi, Yuichi; Latu, G.*; Bigot, J.*; Grandgirard, V.*

Proceedings of Joint International Conference on Supercomputing in Nuclear Applications + Monte Carlo 2020 (SNA + MC 2020), p.218 - 224, 2020/10

Performance portability is expected to be a critical issue in the upcoming exascale era. We explore a performance portable approach for a fusion plasma turbulence simulation code employing the kinetic model, namely the GYSELA code. For this purpose, we extract the key features of GYSELA, such as the high dimensionality (more than 4D) and the semi-Lagrangian scheme, and encapsulate them into a mini-application which solves a Vlasov-Poisson system similar to, but simpler than, that of GYSELA. We implement the mini-app with OpenACC, OpenMP4.5 and Kokkos, where we suppress unnecessary duplications of code lines. Based on our experience, we discuss the advantages and disadvantages of OpenACC, OpenMP4.5 and Kokkos from the viewpoint of performance portability, readability and productivity.

Journal Articles

Overlapping communications in gyrokinetic codes on accelerator-based platforms

Asahi, Yuichi*; Latu, G.*; Bigot, J.*; Maeyama, Shinya*; Grandgirard, V.*; Idomura, Yasuhiro

Concurrency and Computation: Practice and Experience, 32(5), p.e5551_1 - e5551_21, 2020/03

 Times Cited Count:0 Percentile:0.01(Computer Science, Software Engineering)

Two five-dimensional gyrokinetic codes, GYSELA and GKV, were ported to the modern accelerators Xeon Phi KNL and Tesla P100 GPU. Serial computing kernels of GYSELA on KNL and GKV on P100 GPU were respectively 1.3x and 7.4x faster than those on a single Skylake processor. Scaling tests of GYSELA and GKV were respectively performed from 16 to 512 KNLs and from 32 to 256 P100 GPUs, and data transpose communications in the semi-Lagrangian kernels of GYSELA and in the convolution kernels of GKV were found to be the respective main bottlenecks. In order to mitigate the communication costs, pipeline-based and task-based communication overlapping were implemented in these codes.

Journal Articles

Synergy of turbulent and neoclassical transport through poloidal convective cells

Asahi, Yuichi*; Grandgirard, V.*; Sarazin, Y.*; Donnel, P.*; Garbet, X.*; Idomura, Yasuhiro; Dif-Pradalier, G.*; Latu, G.*

Plasma Physics and Controlled Fusion, 61(6), p.065015_1 - 065015_15, 2019/05

 Times Cited Count:3 Percentile:40.69(Physics, Fluids & Plasmas)

The role of poloidal convective cells in transport processes is studied with the full-F gyrokinetic code GYSELA. For this purpose, we apply a numerical filter to convective cells and compare the simulation results with and without the filter. The energy flux driven by the magnetic drifts turns out to be reduced by a factor of about 2 once the numerical filter is applied. A careful analysis reveals that the frequency spectrum of the convective cells is well correlated with that of the turbulent Reynolds stress tensor, lending credence to their turbulence-driven origin. The impact of convective cells can be interpreted as a synergy between turbulence and neoclassical dynamics.

Journal Articles

Turbulent generation of poloidal asymmetries of the electric potential in a tokamak

Donnel, P.*; Garbet, X.*; Sarazin, Y.*; Asahi, Yuichi; Wilczynski, F.*; Caschera, E.*; Dif-Pradalier, G.*; Ghendrih, P.*; Gillot, C.*

Plasma Physics and Controlled Fusion, 61(1), p.014003_1 - 014003_11, 2019/01

 Times Cited Count:7 Percentile:74.25(Physics, Fluids & Plasmas)

Poloidal asymmetries of the $$E \times B$$ plasma flow are known to play a role in neoclassical transport. According to conventional neoclassical theory, the level of poloidal asymmetry of the electric potential is expected to be very small. In the present work, a general framework for the generation of axisymmetric structures of potential by turbulence is presented. Zonal flows, geodesic acoustic modes and convective cells are described by a single model. This is done by solving the gyrokinetic equation coupled to the quasi-neutrality equation. This approach provides a predictive calculation of the frequency spectrum of flows given a specified forcing due to turbulence. It also shows that the dominant mechanism comes from zonal flow compression at intermediate frequencies, while ballooning of the turbulence Reynolds stress appears to be the main drive at low frequency.

Journal Articles

Application of a communication-avoiding generalized minimal residual method to a gyrokinetic five dimensional Eulerian code on many core platforms

Idomura, Yasuhiro; Ina, Takuya*; Mayumi, Akie; Yamada, Susumu; Matsumoto, Kazuya*; Asahi, Yuichi*; Imamura, Toshiyuki*

Proceedings of 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA 2017), p.7_1 - 7_8, 2017/11

A communication-avoiding generalized minimal residual (CA-GMRES) method is applied to the gyrokinetic toroidal five dimensional Eulerian code GT5D, and its performance is compared against the original code with a generalized conjugate residual (GCR) method on the JAEA ICEX (Haswell), the Plasma Simulator (FX100), and the Oakforest-PACS (KNL). The CA-GMRES method has $$\sim 3.8\times$$ higher arithmetic intensity than the GCR method, and thus, is suitable for future Exa-scale architectures with limited memory and network bandwidths. In the performance evaluation, it is shown that compared with the GCR solver, its computing kernels are accelerated by $$1.47\times \sim 2.39\times$$, and the cost of data reduction communication is reduced from $$5\% \sim 13\%$$ to $$\sim 1\%$$ of the total cost at 1,280 nodes.

Journal Articles

Benchmarking of flux-driven full-F gyrokinetic simulations

Asahi, Yuichi*; Grandgirard, V.*; Idomura, Yasuhiro; Garbet, X.*; Latu, G.*; Sarazin, Y.*; Dif-Pradalier, G.*; Donnel, P.*; Ehrlacher, C.*

Physics of Plasmas, 24(10), p.102515_1 - 102515_17, 2017/10

AA2017-0418.pdf:4.26MB

 Times Cited Count:3 Percentile:23.72(Physics, Fluids & Plasmas)

Two full-F global gyrokinetic codes are benchmarked to compute flux-driven ion temperature gradient turbulence in tokamak plasmas. For this purpose, the Semi-Lagrangian code GYSELA and the Eulerian code GT5D are employed, which solve the full-F gyrokinetic equation with a realistic fixed flux condition. Using the appropriate settings for the boundary and initial conditions, flux-driven ITG turbulence simulations are carried out. The avalanche-like transport is assessed with a focus on spatio-temporal properties. A statistical analysis is performed to discuss this self-organized criticality (SOC)-like behavior, where we found $$1/f$$ spectra and a transition to $$1/f^3$$ spectra on the high-frequency side in both codes. Based on these benchmarks, it is verified that the SOC-like behavior is robust and not dependent on numerics.

Journal Articles

Optimization of fusion kernels on accelerators with indirect or strided memory access patterns

Asahi, Yuichi*; Latu, G.*; Ina, Takuya; Idomura, Yasuhiro; Grandgirard, V.*; Garbet, X.*

IEEE Transactions on Parallel and Distributed Systems, 28(7), p.1974 - 1988, 2017/07

 Times Cited Count:4 Percentile:50.79(Computer Science, Theory & Methods)

High-dimensional stencil computations from fusion plasma turbulence codes involving complex memory access patterns, namely the indirect memory access in a Semi-Lagrangian scheme and the strided memory access in a Finite-Difference scheme, are optimized on accelerators such as GPGPUs and Xeon Phi coprocessors. On both devices, the Array of Structure of Array (AoSoA) data layout is preferable for contiguous memory accesses. It is shown that effective local cache usage, achieved by improving spatial and temporal data locality, is critical on Xeon Phi. On GPGPU, the texture memory usage improves the performance of the indirect memory accesses in the Semi-Lagrangian scheme. Thanks to these optimizations, the fusion kernels on accelerators become 1.4x - 8.1x faster than those on Sandy Bridge (CPU).

Journal Articles

Computational challenges towards Exa-scale fusion plasma turbulence simulations

Idomura, Yasuhiro; Asahi, Yuichi; Ina, Takuya; Matsuoka, Seikichi

Proceedings of 24th International Congress of Theoretical and Applied Mechanics (ICTAM 2016), p.3106 - 3107, 2016/08

Turbulent transport in fusion plasmas is one of the key issues in ITER. To address this issue via the five dimensional (5D) gyrokinetic model, a novel computing technique is developed, and strong scaling of the Gyrokinetic Toroidal 5D Eulerian code GT5D is improved up to $$\sim 0.6$$ million cores on the K-computer. The computing technique consists of multi-dimensional/multi-layer domain decomposition, overlap of communication and computation, and optimization of computing kernels for multi-core CPUs. The computing power enabled us to study ITER relevant issues such as the plasma size scaling of turbulent transport. Towards the next generation burning plasma turbulence simulations, the physics model is extended to include kinetic electrons and multi-species ions, and computing kernels are further optimized for the latest many-core architectures.

Journal Articles

Erosion of $$N$$=20 shell in $$^{33}$$Al investigated through the ground-state electric quadrupole moment

Shimada, Kenji*; Ueno, Hideki*; Neyens, G.*; Asahi, Koichiro*; Balabanski, D. L.*; Daugas, J. M.*; Depuydt, M.*; De Rydt, M.*; Gaudefroy, L.*; Grévy, S.*; et al.

Physics Letters B, 714(2-5), p.246 - 250, 2012/08

 Times Cited Count:7 Percentile:44.61(Astronomy & Astrophysics)

no abstracts in English

Journal Articles

Precision measurement of the electric quadrupole moment of $$^{31}$$Al and determination of the effective proton charge in the sd-shell

De Rydt, M.*; Neyens, G.*; Asahi, Koichiro*; Balabanski, D. L.*; Daugas, J. M.*; Depuydt, M.*; Gaudefroy, L.*; Grévy, S.*; Hasama, Yuka*; Ichikawa, Yuichi*; et al.

Physics Letters B, 678(4), p.344 - 349, 2009/07

 Times Cited Count:14 Percentile:67.48(Astronomy & Astrophysics)

no abstracts in English

Oral presentation

The Influence of trapped electron mode driven zonal flow on electron temperature gradient driven turbulence

Asahi, Yuichi; Ishizawa, Akihiro*; Watanabe, Tomohiko*; Sugama, Hideo*; Tsutsui, Hiroaki*; Iio, Shunji*

no journal, , 

Turbulent transport driven by electron temperature gradient (ETG) modes and trapped electron modes (TEMs) is investigated by means of gyrokinetic simulations. It is found that ETG turbulence can be suppressed by zonal flows driven by TEMs. Then, the mechanism of the regulation of ETG turbulence by zonal flows is investigated by nonlinear entropy transfer analysis. Firstly, it is confirmed that the entropy is transferred from TEMs to zonal flows. Secondly, it is found that the zonal flows in the steady state mediate the entropy transfer of the ETG modes from low to high radial wavenumber regions. In short, it is quantitatively shown that the zonal flows are driven by TEMs and the ETG turbulence is regulated by the TEM-driven zonal flows.

Oral presentation

Optimization of fusion plasma codes

Asahi, Yuichi; Latu, G.*; Ina, Takuya; Idomura, Yasuhiro; Virginie, G.*; Garbet, X.*

no journal, , 

We present the optimization of kernels from fusion plasma codes, GYSELA and GT5D, on Tera-flops many-core architectures including accelerators (Xeon Phi, GPU), and a multi-core CPU (FX100). The GYSELA kernel is based on a semi-Lagrangian scheme with high arithmetic intensity. Through the optimization of the GYSELA kernel on Xeon Phi, we show the importance of vectorization on Xeon Phi. For the GT5D kernel, which is based on a finite difference scheme, a sophisticated memory access is necessary for high performance. Through the optimization of the GT5D kernel on GPUs, we show the effective optimization for memory access on GPUs.

Oral presentation

Optimization of fusion plasma turbulence code on GPU

Asahi, Yuichi; Idomura, Yasuhiro; Ina, Takuya

no journal, , 

Because of their collisionless nature, fusion plasmas have fine structures in velocity space. Thus, for the analysis of fusion plasma turbulence, which degrades plasma confinement, five dimensional kinetic models are often employed rather than the usual three dimensional fluid model. In this study, we present the optimization of a fusion plasma code, GT5D, which employs a finite difference method. We then discuss the optimization techniques effective for high dimensional stencil based calculations.

Oral presentation

Optimization of stencil-based fusion kernels on Tera-flops many-core architectures

Asahi, Yuichi; Latu, G.*; Ina, Takuya; Idomura, Yasuhiro; Grandgirard, V.*; Garbet, X.*

no journal, , 

We present the optimization of kernels from fusion plasma codes, GYSELA and GT5D, on Tera-flops many-core architectures including accelerators (Xeon Phi, GPU), and a multi-core CPU (FX100). GYSELA kernel is based on a semi-Lagrangian scheme with high arithmetic intensity. Through the optimization of GYSELA kernel on Xeon Phi, we show the importance of the vectorization of a code. For GT5D kernel, which is based on a finite difference scheme, a sophisticated memory access is necessary for attaining high performance. Through the optimization of GT5D kernel on GPUs, we show the effective optimization for memory access with the help of the shared memory.

Oral presentation

Development of optimization of stencil calculation on Tera-flops many-core architecture

Ina, Takuya; Asahi, Yuichi; Idomura, Yasuhiro

no journal, , 

Plasma turbulence simulations require significant computational resources. In particular, an Exa-scale machine is essential for simulations at the scale of the International Thermonuclear Experimental Reactor (ITER). The architecture of Exa-scale machines is undecided, but it is believed that it will be based on existing architectures. The purpose of this study is to establish optimization techniques for stencil calculations on the Xeon Phi, GPU and FX100. These architectures have teraflops-class computing performance. For the Xeon Phi, we apply dynamic scheduling and change multiple loops into a single loop. For the GPU, we reuse registers and avoid warp divergence. For the FX100, we promote software prefetch and reuse of the L1 and L2 caches by adjusting the chunk size. Performance is improved by applying these optimizations on the Xeon Phi, GPU and FX100. We confirmed effective optimization methods for stencil calculations on the Xeon Phi, GPU and FX100.
