Asahi, Yuichi; Onodera, Naoyuki; Hasegawa, Yuta; Shimokawabe, Takashi*; Shiba, Hayato*; Idomura, Yasuhiro
Boundary-Layer Meteorology, 34 Pages, 2023/01
Times Cited Count:0 Percentile:0.19 (Meteorology & Atmospheric Sciences)
We develop a Transformer-based deep learning model to predict plume concentrations in urban areas under uniform flow conditions. Our model has two distinct input layers: Transformer layers for sequential data, and convolutional layers, as used in convolutional neural networks (CNNs), for image-like data. The model can predict the plume concentration from realistically available data such as time-series monitoring data at a few observation stations, building shapes, and the source location. It is shown that the model gives reasonably accurate predictions orders of magnitude faster than CFD simulations. It is also shown that exactly the same model can be applied to predict the source location, again with reasonable accuracy.
Hasegawa, Yuta; Onodera, Naoyuki; Asahi, Yuichi; Idomura, Yasuhiro
Dai-36-Kai Suchi Ryutai Rikigaku Shimpojiumu Koen Rombunshu (Internet), 5 Pages, 2022/12
This study implemented and tested ensemble data assimilation (DA) of turbulent flows using the lattice Boltzmann method and the local ensemble transform Kalman filter (LBM-LETKF). The computational code was implemented fully on GPUs. The test was carried out for the 3D turbulent flow around a square cylinder with meshes and 32 ensemble members using 32 GPUs. The time interval of the DA in the test was half the period of the Kármán vortex shedding. The normalized mean absolute errors (NMAE) of the lift coefficient were 132%, 148%, and 13.2% for the non-DA case, the nudging case (a simpler DA algorithm), and the LETKF case, respectively. It was found that the LETKF achieved good DA accuracy even though the observations were not frequent enough to resolve the small-scale turbulence, while the nudging showed systematic delays in its solution and could not maintain DA accuracy.
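To make the error figures above concrete, here is a minimal Python sketch of a normalized mean absolute error. The abstract does not say how the error is normalized; dividing by the mean absolute value of the reference signal is an assumption, and the function name `nmae` is hypothetical.

```python
import numpy as np


def nmae(predicted, reference):
    """Mean absolute error divided by the mean absolute reference value.

    The normalization is an assumption made for illustration only.
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if predicted.shape != reference.shape:
        raise ValueError("predicted and reference must have the same shape")
    scale = np.mean(np.abs(reference))
    if scale == 0.0:
        raise ValueError("reference signal is identically zero")
    return float(np.mean(np.abs(predicted - reference)) / scale)


# A periodic signal in anti-phase with the reference gives an NMAE of 2 (200%),
# which shows how values above 100% can arise for an oscillating lift coefficient.
t = np.linspace(0.0, 2.0 * np.pi, 200)
anti_phase_error = nmae(-np.sin(t), np.sin(t))
```

Under this definition, an error above 100% means the prediction deviates from the reference by more than the typical magnitude of the reference itself, as happens when vortex shedding drifts out of phase.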
Onodera, Naoyuki; Idomura, Yasuhiro; Hasegawa, Yuta; Nakayama, Hiromasa
Dai-36-Kai Suchi Ryutai Rikigaku Shimpojiumu Koen Rombunshu (Internet), 3 Pages, 2022/12
We have developed a wind simulation code named CityLBM to realize wind digital twins. Mesoscale wind conditions are given as boundary conditions in CityLBM by using a nudging data assimilation method. It is found that conventional approaches with constant nudging coefficients fail to reproduce turbulent intensity in long time simulations, where atmospheric stability conditions change significantly. We propose a dynamic parameter optimization method for the nudging coefficient based on an ensemble Kalman filter. CityLBM was validated against plume dispersion experiments in the complex urban environment of Oklahoma City. The nudging coefficient was updated to reduce the error of the turbulent intensity between the simulation and the observation. The mean error of velocity variance is reduced by 10% compared to the conventional nudging method with a constant nudging coefficient.
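The nudging idea described above can be sketched in a few lines of Python. This is a generic scalar relaxation toward an observed value with a constant coefficient, not the CityLBM implementation; the function name `nudged_step`, the explicit Euler time stepping, and all parameter names are assumptions made for illustration. The ensemble-Kalman-filter update of the coefficient is not reproduced here.

```python
def nudged_step(state, observation, tendency, nudging_coeff, dt):
    """One explicit Euler step of dx/dt = tendency(x) + nudging_coeff * (observation - x)."""
    return state + dt * (tendency(state) + nudging_coeff * (observation - state))


# With no model tendency, the state relaxes toward the observation
# at a rate set by the nudging coefficient.
x = 0.0
for _ in range(200):
    x = nudged_step(x, observation=2.0, tendency=lambda s: 0.0,
                    nudging_coeff=1.0, dt=0.1)
```

A larger coefficient pulls the simulation toward the observation more strongly and damps the model's own fluctuations, which is consistent with the abstract's point that a single constant value may not suit changing atmospheric stability conditions.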
Nakayama, Hiromasa; Onodera, Naoyuki; Satoh, Daiki; Nagai, Haruyasu; Hasegawa, Yuta; Idomura, Yasuhiro
Journal of Nuclear Science and Technology, 59(10), p.1314 - 1329, 2022/10
Times Cited Count:2 Percentile:71.47 (Nuclear Science & Technology)
We developed a local-scale high-resolution atmospheric dispersion and dose assessment system (LHADDAS) for safety and consequence assessment of nuclear facilities and emergency response to nuclear accidents or deliberate releases of radioactive materials in built-up urban areas. This system is composed of pre-processing of input files, main calculation by local-scale high-resolution atmospheric dispersion model using large-eddy simulation (LOHDIM-LES) and real-time urban dispersion simulation model based on a lattice Boltzmann method (CityLBM), and post-processing of dose-calculation by simulation code powered by lattice dose-response functions (SIBYL). LHADDAS has a broad utility and offers superior performance in (1) simulating turbulent flows, plume dispersion, and dry deposition under realistic meteorological conditions, (2) performing real-time tracer dispersion simulations using a locally mesh-refined lattice Boltzmann method, and (3) estimating air dose rates of radionuclides from air concentrations and surface deposition in consideration of the influence of individual buildings and structures. This system is promising for safety assessment of nuclear facilities as an alternative to wind tunnel experiments, detailed pre/post-analyses of a local-scale radioactive plume dispersion in case of nuclear accidents, and quick response to emergency situations resulting from deliberate release of radioactive materials by a terrorist attack in an urban central district area.
Hasegawa, Yuta; Onodera, Naoyuki; Asahi, Yuichi; Idomura, Yasuhiro
Keisan Kogaku Koenkai Rombunshu (CD-ROM), 27, 4 Pages, 2022/06
We developed a GPU implementation of ensemble data assimilation (DA) using the local ensemble transform Kalman filter (LETKF) with the lattice Boltzmann method (LBM). The performance test was carried out with up to 32 ensemble members of two-dimensional isotropic turbulence simulations using the D2Q9 LBM. The computational cost of the LETKF was less than or nearly equal to that of the LBM up to eight ensemble members, while the former exceeded the latter for larger ensembles. At 32 ensemble members, the computational costs per cycle of the LETKF and the LBM were 28.3 msec and 5.39 msec, respectively. These results suggest that further speedup of the LETKF is needed for practical 3D LBM simulations.
Sugihara, Kenta; Onodera, Naoyuki; Idomura, Yasuhiro; Yamashita, Susumu
Keisan Kogaku Koenkai Rombunshu (CD-ROM), 27, 5 Pages, 2022/06
The phase-field method has been successfully applied to various multi-phase flow problems as an interface tracking method for gas-liquid interfaces. However, the accuracy of the phase-field method depends on hyper-parameters, which are empirically adjusted for each problem. The phase-field method sustains sharp interfaces by the balance between the numerical viscosity of the advection term and the interface modification by the diffusion and anti-diffusion terms. Based on this fact, we propose a method for deriving the optimal hyper-parameters in a non-empirical manner by performing a basic error analysis of the interface advection.
Onodera, Naoyuki; Idomura, Yasuhiro; Hasegawa, Yuta; Shimokawabe, Takashi*; Aoki, Takayuki*
Keisan Kogaku Koenkai Rombunshu (CD-ROM), 27, 4 Pages, 2022/06
We have developed a wind simulation code named CityLBM to realize wind digital twins. Mesoscale wind conditions are given as boundary conditions in CityLBM by using a nudging data assimilation method. It is found that conventional approaches with constant nudging coefficients fail to reproduce turbulent intensity in long time simulations, where atmospheric stability conditions change significantly. We propose a dynamic parameter optimization method for the nudging coefficient based on a particle filter. CityLBM was validated against plume dispersion experiments in the complex urban environment of Oklahoma City. The nudging coefficient was updated to reduce the error of the turbulent intensity between the simulation and the observation, and the atmospheric boundary layer was reproduced throughout the day.
Asahi, Yuichi; Onodera, Naoyuki; Hasegawa, Yuta; Shimokawabe, Takashi*; Shiba, Hayato*; Idomura, Yasuhiro
Keisan Kogaku Koenkai Rombunshu (CD-ROM), 27, 5 Pages, 2022/06
We have ported the GPU-accelerated lattice Boltzmann method code "CityLBM" to the AMD MI100 GPU. We present the performance of CityLBM achieved on NVIDIA P100, V100, and A100 GPUs and on the AMD MI100 GPU. Using host-to-host MPI communications, the performance on the MI100 GPU is around 20% better than on the V100 GPU. Most of the kernels are successfully accelerated, except for the interpolation kernels of the adaptive mesh refinement (AMR) method.
Watanabe, Tomohiko*; Idomura, Yasuhiro; Todo, Yasushi*; Honda, Mitsuru*
Nihon Genshiryoku Gakkai-Shi ATOMO, 64(3), p.152 - 156, 2022/03
Understanding of physical processes of particle, momentum, and thermal transports is essential for predicting the confinement performance of burning plasmas in ITER, which is targeting the scientific demonstration of magnetic confinement fusion. First principles based simulations on Fugaku disclosed physical mechanisms such as complex transport processes of multi-scale turbulence in deuterium-tritium plasmas and kinetic effects in energetic particle transport due to electromagnetic fluctuations. We promote further research and development of first principles based simulations towards the performance prediction of burning plasmas.
Hasegawa, Yuta; Imamura, Toshiyuki*; Ina, Takuya; Onodera, Naoyuki; Asahi, Yuichi; Idomura, Yasuhiro
Proceedings of 13th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems (ScalAH22) (Internet), p.10 - 17, 2022/00
The ensemble data assimilation of computational fluid dynamics simulations based on the lattice Boltzmann method (LBM) and the local ensemble transform Kalman filter (LETKF) is implemented and optimized on a GPU supercomputer based on NVIDIA A100 GPUs. To connect the LBM and LETKF parts, data transpose communication is optimized by overlapping computation, file I/O, and communication based on data dependency in each LETKF kernel. In two dimensional forced isotropic turbulence simulations with the ensemble size of and the number of grid points of , the optimized implementation achieved speedup from the naive implementation, in which the LETKF part is not parallelized. The main computing kernel of the local problem is the eigenvalue decomposition (EVD) of real symmetric dense matrices, which is computed by a newly developed batched EVD in EigenG. The batched EVD in EigenG outperforms that in cuSolver, and speedup was achieved.
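To show where the symmetric eigenvalue decomposition enters the filter, the following is a minimal NumPy sketch of a single ensemble transform Kalman filter analysis step. It is a textbook-style global transform with no localization or inflation, a linear observation operator, and a diagonal observation error covariance; those simplifications, the function name `etkf_analysis`, and the argument names are assumptions, and the code is not the implementation described in the abstract. An LETKF would apply a transform of this kind independently to many local regions, which is what makes a batched EVD attractive.

```python
import numpy as np


def etkf_analysis(ensemble, obs, obs_operator, obs_var):
    """Return the analysis ensemble.

    ensemble:     (n_state, k) array, one member per column
    obs:          (m,) observation vector
    obs_operator: (m, n_state) linear observation operator (assumed linear)
    obs_var:      scalar observation error variance (R = obs_var * I, assumed)
    """
    k = ensemble.shape[1]
    x_mean = ensemble.mean(axis=1, keepdims=True)
    x_pert = ensemble - x_mean
    y_ens = obs_operator @ ensemble
    y_mean = y_ens.mean(axis=1, keepdims=True)
    y_pert = y_ens - y_mean

    c = y_pert.T / obs_var                      # (k, m), equals Y^T R^{-1}
    a = (k - 1) * np.eye(k) + c @ y_pert        # real symmetric, positive definite
    eigval, eigvec = np.linalg.eigh(a)          # the EVD kernel

    pa_tilde = (eigvec / eigval) @ eigvec.T     # A^{-1}
    w_pert = np.sqrt(k - 1) * (eigvec / np.sqrt(eigval)) @ eigvec.T
    w_mean = pa_tilde @ c @ (obs.reshape(-1, 1) - y_mean)
    return x_mean + x_pert @ (w_pert + w_mean)
```

The matrix passed to `np.linalg.eigh` has the size of the ensemble, not of the state, so in a localized filter the work consists of many small, independent symmetric eigenproblems.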
Onodera, Naoyuki; Idomura, Yasuhiro; Hasegawa, Yuta; Nakayama, Hiromasa
Dai-35-Kai Suchi Ryutai Rikigaku Shimpojiumu Koen Rombunshu (Internet), 3 Pages, 2021/12
A detailed wind simulation is very important for designing smart cities. Since a lot of tall buildings and complex structures make the air flow turbulent in urban cities, large-scale CFD simulations are needed. We develop a GPU-based CFD code based on a Lattice Boltzmann Method (LBM) with a block-based Adaptive Mesh Refinement (AMR) method. In order to reproduce real wind conditions, the wind condition and ground temperature of the mesoscale weather forecasting model are given as boundary conditions. In this research, a surface heat flux model based on the Monin-Obukhov similarity theory was introduced to improve the calculation accuracy. We conducted a detailed wind simulation in Oklahoma City. By executing this computation, wind conditions in the urban area were reproduced with good accuracy.
Hasegawa, Yuta; Aoki, Takayuki*; Kobayashi, Hiromichi*; Idomura, Yasuhiro; Onodera, Naoyuki
Parallel Computing, 108, p.102851_1 - 102851_12, 2021/12
Times Cited Count:0 Percentile:0.01 (Computer Science, Theory & Methods)
The aerodynamics simulation code based on the lattice Boltzmann method (LBM) using forest-of-octrees-based block-structured local mesh refinement (LMR) was implemented, and its performance was evaluated on GPU-based supercomputers. We found that the conventional Space-Filling-Curve-based (SFC) domain partitioning algorithm results in costly halo communication in our aerodynamics simulations. Our new tree cutting approach improved the locality and the topology of the partitioned sub-domains and reduced the communication cost to one-third or one-fourth of the original SFC approach. In the strong scaling test, the code achieved maximum speedup at the performance of 2207 MLUPS (mega-lattice updates per second) on 128 GPUs. In the weak scaling test, the code achieved 9620 MLUPS at 128 GPUs with 4.473 billion grid points, while the parallel efficiency was 93.4% from 8 to 128 GPUs.
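The throughput and efficiency figures above can be related with some simple arithmetic, sketched here in Python. The definition of weak-scaling efficiency as the ratio of per-GPU throughput at the two GPU counts is an assumption (the abstract does not define it), and the function names are hypothetical. Under that assumption, the quoted 9620 MLUPS and 93.4% would imply roughly 644 MLUPS on 8 GPUs; that value is a back-of-the-envelope inference, not a number reported in the abstract.

```python
def mlups(grid_points, time_steps, wall_seconds):
    """Mega-lattice updates per second for a run of given size and duration."""
    return grid_points * time_steps / wall_seconds / 1.0e6


def weak_scaling_efficiency(perf_small, gpus_small, perf_large, gpus_large):
    """Ratio of per-GPU throughput at the larger and smaller GPU counts (assumed definition)."""
    return (perf_large / gpus_large) / (perf_small / gpus_small)


# Invert the efficiency definition to estimate the 8-GPU throughput
# implied by the quoted 128-GPU figures.
implied_perf_8_gpus = 9620.0 / 128 / 0.934 * 8
```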
Ina, Takuya*; Idomura, Yasuhiro; Imamura, Toshiyuki*; Yamashita, Susumu; Onodera, Naoyuki
Proceedings of 12th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA21) (Internet), 8 Pages, 2021/11
Times Cited Count:0 Percentile:0.02
A new mixed-precision preconditioner based on the iterative refinement (IR) method is developed for preconditioned conjugate gradient (P-CG) and multigrid preconditioned conjugate gradient (MGCG) solvers in a multi-phase thermal-hydraulic CFD code JUPITER. In the IR preconditioner, all data is stored in FP16 to reduce memory access, while all computation is performed in FP32. The hybrid FP16/32 implementation retains convergence properties similar to those of FP32, while its computational performance is close to that of FP16. The developed solvers are optimized on Fugaku (A64FX) and applied to ill-conditioned matrices in JUPITER. The P-CG and MGCG solvers with the new IR preconditioner show excellent strong scaling up to 8,000 nodes, and at 8,000 nodes they are accelerated by factors of up to 4.86 and 2.39, respectively, over the conventional ones on Oakforest-PACS (KNL).
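The principle of combining low-precision storage with higher-precision arithmetic can be illustrated with a small NumPy sketch of mixed-precision iterative refinement. Here the matrix is stored in FP16, an approximate inner solve (a few Jacobi sweeps) is carried out in FP32, and the outer residual is evaluated in FP64. This is a generic illustration under those assumptions, not the preconditioner used in JUPITER; the function names, the Jacobi inner solver, and the FP64 outer loop are all choices made for the example.

```python
import numpy as np


def jacobi_fp16_storage(a16, rhs32, sweeps=5):
    """Approximate solve with the matrix stored in FP16 and arithmetic in FP32."""
    a32 = a16.astype(np.float32)          # upcast on load; storage stays FP16
    diag = np.diag(a32)
    off_diag = a32 - np.diag(diag)
    z = np.zeros_like(rhs32)
    for _ in range(sweeps):
        z = (rhs32 - off_diag @ z) / diag
    return z


def iterative_refinement(a, b, outer_iters=30):
    """Solve a @ x = b by repeatedly correcting x with a low-precision inner solve."""
    a16 = a.astype(np.float16)            # low-precision copy used by the inner solve
    x = np.zeros_like(b, dtype=np.float64)
    for _ in range(outer_iters):
        residual = b - a @ x              # residual in FP64 with the original matrix
        correction = jacobi_fp16_storage(a16, residual.astype(np.float32))
        x += correction.astype(np.float64)
    return x
```

Because the residual is always recomputed with the original matrix, rounding errors from the FP16 copy affect only the convergence rate and not the final accuracy, provided the inner solve remains a good enough approximation.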
Asahi, Yuichi; Hatayama, Sora*; Shimokawabe, Takashi*; Onodera, Naoyuki; Hasegawa, Yuta; Idomura, Yasuhiro
Proceedings of 2021 IEEE International Conference on Cluster Computing (IEEE Cluster 2021) (Internet), p.686 - 691, 2021/10
Times Cited Count:1 Percentile:73.2
We develop a convolutional neural network model to predict multi-resolution steady flow. Based on the state-of-the-art image-to-image translation model pix2pixHD, our model can predict the high-resolution flow field from a set of patched signed distance functions. By patching the high-resolution data, the memory requirements of our model are reduced compared to pix2pixHD.
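The patching step mentioned above can be pictured with a short NumPy sketch that splits a 2D field into non-overlapping square tiles. The abstract does not describe the patch layout, so non-overlapping tiles of equal size, the function name `split_into_patches`, and the row-major patch ordering are assumptions for illustration.

```python
import numpy as np


def split_into_patches(field, patch_size):
    """Split a 2D array into non-overlapping square patches, in row-major patch order."""
    height, width = field.shape
    if height % patch_size or width % patch_size:
        raise ValueError("field dimensions must be multiples of patch_size")
    rows, cols = height // patch_size, width // patch_size
    tiles = field.reshape(rows, patch_size, cols, patch_size).swapaxes(1, 2)
    return tiles.reshape(rows * cols, patch_size, patch_size)


# A 4x4 field becomes four 2x2 patches.
patches = split_into_patches(np.arange(16).reshape(4, 4), patch_size=2)
```

Processing one patch at a time keeps the size of each network input fixed, so peak memory depends on the patch size and not on the full resolution of the field.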
Matsuoka, Seikichi*; Sugama, Hideo*; Idomura, Yasuhiro
Physics of Plasmas, 28(6), p.064501_1 - 064501_5, 2021/06
Times Cited Count:3 Percentile:55.67 (Physics, Fluids & Plasmas)
The improved model collision operator proposed by Sugama et al., which can recover the friction-flow relation of the linearized Landau collision operator, is newly implemented in a global full-f gyrokinetic simulation code, GT5D, and collisional transport simulations of a single ion species plasma in a tokamak are performed over a wide collisionality regime. The improved operator is verified to reproduce the theoretical collisional thermal diffusivity precisely in the high collisionality regime, where a more accurate friction-flow relation is required than in the lower collisionality regime. In addition, it is found in all collisionality regimes that the improved operator gives higher accuracy for the collisional thermal diffusivity and the parallel flow coefficient, demonstrating that collisional processes described by the linearized Landau collision operator are correctly retained.
Onodera, Naoyuki; Idomura, Yasuhiro; Hasegawa, Yuta; Nakayama, Hiromasa; Shimokawabe, Takashi*; Aoki, Takayuki*
Boundary-Layer Meteorology, 179(2), p.187 - 208, 2021/05
Times Cited Count:8 Percentile:83.12 (Meteorology & Atmospheric Sciences)
A plume dispersion simulation code named CityLBM enables a real time simulation for several km by applying adaptive mesh refinement (AMR) method on GPU supercomputers. We assess plume dispersion problems in the complex urban environment of Oklahoma City (JU2003). Realistic mesoscale wind boundary conditions of JU2003 produced by a Weather Research and Forecasting Model (WRF), building structures, and a plant canopy model are introduced to CityLBM. Ensemble calculations are performed to reduce turbulence uncertainties. The statistics of the plume dispersion field, mean and max concentrations show that ensemble calculations improve the accuracy of the estimation, and the ensemble-averaged concentration values in the simulations over 4 km areas with 2-m resolution satisfied factor 2 agreements for 70% of 24 target measurement points and periods in JU2003.
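The "factor 2 agreement" criterion quoted above can be expressed as a short Python function. The sketch assumes the usual convention that a prediction agrees when it lies between one half and twice the observed value, with both bounds inclusive; the function name `fac2` and the sample values are hypothetical and are not data from the study.

```python
import numpy as np


def fac2(predicted, observed):
    """Fraction of points where 0.5 * observed <= predicted <= 2 * observed."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if np.any(observed <= 0.0):
        raise ValueError("observed concentrations must be positive")
    within = (predicted >= 0.5 * observed) & (predicted <= 2.0 * observed)
    return float(np.mean(within))


# Hypothetical values: three of the four predictions fall within a factor of two.
example_score = fac2([1.0, 3.0, 0.6, 8.0], [1.0, 1.0, 1.0, 4.0])
```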
Hasegawa, Yuta; Aoki, Takayuki*; Kobayashi, Hiromichi*; Idomura, Yasuhiro; Onodera, Naoyuki
Keisan Kogaku Koenkai Rombunshu (CD-ROM), 26, 6 Pages, 2021/05
We introduce an improved domain partitioning method called "tree cutting approach" for the aerodynamics simulation code based on the lattice Boltzmann method (LBM) with the forest-of-octrees-based local mesh refinement (LMR). The conventional domain partitioning algorithm based on the space-filling curve (SFC), which is widely used in LMR, caused a costly halo data communication which became a bottleneck of our aerodynamics simulation on the GPU-based supercomputers. Our tree cutting approach adopts a hybrid domain partitioning with the coarse structured block decomposition and the SFC partitioning in each block. This hybrid approach improved the locality and the topology of the partitioned sub-domains and reduced the amount of the halo communication to one-third of the original SFC approach. The code achieved speedup on 8 GPUs, and achieved speedup at the performance of 2207 MLUPS (mega-lattice updates per second) on 128 GPUs with strong scaling test.
Onodera, Naoyuki; Idomura, Yasuhiro; Hasegawa, Yuta; Shimokawabe, Takashi*; Aoki, Takayuki*
Keisan Kogaku Koenkai Rombunshu (CD-ROM), 26, 3 Pages, 2021/05
We develop a mixed-precision preconditioner for the pressure Poisson equation in a two-phase flow CFD code JUPITER-AMR. The multi-grid (MG) preconditioner is constructed based on the geometric MG method with a three-stage V-cycle and a cache-reuse SOR (CR-SOR) method at each stage. The numerical experiments are conducted for two-phase flows in a fuel bundle of a nuclear reactor. The MG-CG solver in single precision shows the same convergence history as in double precision, while taking about 75% of the double-precision computational time. In the strong scaling test, the MG-CG solver in single precision is accelerated by a factor of 1.88 between 32 and 96 GPUs.
Asahi, Yuichi; Hatayama, Sora*; Shimokawabe, Takashi*; Onodera, Naoyuki; Hasegawa, Yuta; Idomura, Yasuhiro
Keisan Kogaku Koenkai Rombunshu (CD-ROM), 26, 4 Pages, 2021/05
We develop a convolutional neural network model to predict multi-resolution steady flow. Based on the state-of-the-art image-to-image translation model Pix2PixHD, our model can predict the high-resolution flow field from the signed distance function. By patching the high-resolution data, the memory requirements of our model are reduced compared to Pix2PixHD.
Idomura, Yasuhiro; Obrejan, K.*; Asahi, Yuichi; Honda, Mitsuru*
Physics of Plasmas, 28(1), p.012501_1 - 012501_11, 2021/01
Times Cited Count:3 Percentile:55.67 (Physics, Fluids & Plasmas)
Tracer impurity transport in ion temperature gradient driven (ITG) turbulence is investigated using a global full-$f$ gyrokinetic simulation including kinetic electrons, bulk ions, and low to medium $Z$ tracer impurities, where $Z$ is the charge number. It is found that in addition to turbulent particle transport, enhanced neoclassical particle transport due to a new synergy effect between turbulent and neoclassical transports makes a significant contribution to tracer impurity transport. Bursty excitation of the ITG mode generates non-ambipolar turbulent particle fluxes of electrons and bulk ions, leading to a fast growth of the radial electric field following the ambipolar condition. The divergence of $E \times B$ flows compresses up-down asymmetric density perturbations, which are subject to transport induced by the magnetic drift. The enhanced neoclassical particle transport depends on the ion mass, because the magnitude of up-down asymmetric density perturbation is determined by a competition between the $E \times B$ compression effect and the return current given by the parallel streaming motion. This mechanism does not work for the temperature, and thus, selectively enhances only particle transport.