Search Results: Records 1-7 displayed on this page of 7

Journal Articles

Tree cutting approach for domain partitioning on forest-of-octrees-based block-structured static adaptive mesh refinement with lattice Boltzmann method

Hasegawa, Yuta; Aoki, Takayuki*; Kobayashi, Hiromichi*; Idomura, Yasuhiro; Onodera, Naoyuki

Parallel Computing, 108, p.102851_1 - 102851_12, 2021/12

Times Cited Count: 1; Percentile: 34.17 (Computer Science, Theory & Methods)

The aerodynamics simulation code based on the lattice Boltzmann method (LBM) using forest-of-octrees-based block-structured local mesh refinement (LMR) was implemented, and its performance was evaluated on GPU-based supercomputers. We found that the conventional space-filling-curve-based (SFC) domain partitioning algorithm results in costly halo communication in our aerodynamics simulations. Our new tree cutting approach improved the locality and the topology of the partitioned sub-domains and reduced the communication cost to one-third or one-fourth of that of the original SFC approach. In the strong scaling test, the code achieved a maximum $$\times 1.82$$ speedup at a performance of 2207 MLUPS (mega-lattice updates per second) on 128 GPUs. In the weak scaling test, the code achieved 9620 MLUPS on 128 GPUs with 4.473 billion grid points, with a parallel efficiency of 93.4% from 8 to 128 GPUs.
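The baseline the abstract compares against, SFC domain partitioning, can be sketched in a few lines: blocks are ordered along a space-filling curve and the sorted sequence is cut into equal chunks, one per rank. The following is a minimal 2-D Morton (Z-order) illustration; the function names are hypothetical and the paper's implementation works on 3-D octrees.

```python
# Illustrative sketch of space-filling-curve (SFC) domain partitioning.
# morton2d and sfc_partition are hypothetical names, not the paper's API.

def morton2d(x: int, y: int, bits: int = 8) -> int:
    """Interleave the bits of (x, y) to get a Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bit -> even position
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bit -> odd position
    return key

def sfc_partition(blocks, n_ranks):
    """Sort blocks along the Z-order curve, then cut into equal chunks."""
    ordered = sorted(blocks, key=lambda b: morton2d(*b))
    chunk = -(-len(ordered) // n_ranks)  # ceiling division
    return [ordered[i * chunk:(i + 1) * chunk] for i in range(n_ranks)]

# Example: a 4x4 grid of blocks split over 4 ranks.
blocks = [(x, y) for x in range(4) for y in range(4)]
parts = sfc_partition(blocks, 4)
```

Because the Z-order curve can jump across the domain, the chunks it produces are not always compact, which is why halo communication can become costly, as the abstract notes.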

Journal Articles

Improved domain partitioning on tree-based mesh-refined lattice Boltzmann method

Hasegawa, Yuta; Aoki, Takayuki*; Kobayashi, Hiromichi*; Idomura, Yasuhiro; Onodera, Naoyuki

Keisan Kogaku Koenkai Rombunshu (CD-ROM), 26, 6 Pages, 2021/05

We introduce an improved domain partitioning method called the "tree cutting approach" for our aerodynamics simulation code based on the lattice Boltzmann method (LBM) with forest-of-octrees-based local mesh refinement (LMR). The conventional domain partitioning algorithm based on the space-filling curve (SFC), which is widely used in LMR, caused costly halo data communication that became a bottleneck of our aerodynamics simulations on GPU-based supercomputers. Our tree cutting approach adopts a hybrid domain partitioning: a coarse structured block decomposition combined with SFC partitioning within each block. This hybrid approach improved the locality and the topology of the partitioned sub-domains and reduced the amount of halo communication to one-third of that of the original SFC approach. The code achieved a $$\times 1.23$$ speedup on 8 GPUs and a $$\times 1.82$$ speedup at a performance of 2207 MLUPS (mega-lattice updates per second) on 128 GPUs in the strong scaling test.
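The hybrid scheme described above can be sketched as a two-level loop: an outer structured decomposition into coarse rectangular blocks, and an inner Z-order sort and equal split within each block. This is a hypothetical 2-D toy (the real code partitions 3-D octree leaves), with all names invented for illustration.

```python
# Toy sketch of hybrid partitioning: coarse structured blocks outside,
# space-filling-curve ordering inside each block. Hypothetical names.

def zorder(x: int, y: int, bits: int = 8) -> int:
    """Morton (Z-order) key by bit interleaving."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

def hybrid_partition(nx, ny, coarse, ranks_per_block):
    """Partition an nx-by-ny grid of leaves.

    coarse: number of coarse blocks per axis.
    Returns {(block_i, block_j, local_rank): [leaf coords, ...]}.
    """
    bx, by = nx // coarse, ny // coarse  # leaves per coarse block per axis
    parts = {}
    for bi in range(coarse):
        for bj in range(coarse):
            leaves = [(bi * bx + x, bj * by + y)
                      for x in range(bx) for y in range(by)]
            # SFC ordering using block-local coordinates
            leaves.sort(key=lambda p: zorder(p[0] % bx, p[1] % by))
            chunk = -(-len(leaves) // ranks_per_block)
            for r in range(ranks_per_block):
                parts[(bi, bj, r)] = leaves[r * chunk:(r + 1) * chunk]
    return parts

parts = hybrid_partition(8, 8, 2, 4)  # 8x8 leaves, 2x2 blocks, 4 ranks each
```

Confining each SFC segment to one coarse block is what keeps every sub-domain local, which is the source of the communication reduction the abstract reports.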

JAEA Reports

Development of production methods of the Sr-90/Y-90 source for hydrogen production experiments

Motoki, Riyozo; Aoki, Hiromichi; Uchida, Shoji; Nagaishi, Ryuji; Yamada, Reiji

JAEA-Technology 2008-014, 23 Pages, 2008/03

JAEA-Technology-2008-014.pdf:9.05MB

The production of hydrogen with a Sr-90/Y-90 source is being studied as an effective use of radioactive waste. We therefore developed two methods of caking Sr-90 together with a catalyst for efficient hydrogen production: one cakes $$^{90}$$SrTiO$$_{3}$$ and TiO$$_{2}$$ in a silica gel, and the other cakes $$^{90}$$SrSO$$_{4}$$ and TiO$$_{2}$$ in a silica gel. These solids are porous materials that are both radiation-resistant and chemical-resistant. In addition, Y-90, the daughter nuclide of Sr-90, can also be used for hydrogen production.

JAEA Reports

Analytical work at NUCEF in FY 2006

Sakazume, Yoshinori; Aoki, Hiromichi; Haga, Takahisa; Fukaya, Hiroyuki; Sonoda, Takashi; Shimizu, Kaori; Niitsuma, Yasushi*; Ito, Mitsuo; Inoue, Takeshi

JAEA-Technology 2007-069, 44 Pages, 2008/02

JAEA-Technology-2007-069.pdf:4.55MB

Analysis of the uranyl nitrate solution fuel is carried out at the analytical laboratory of NUCEF (Nuclear Fuel Cycle Engineering Research Facility), which provides essential data for the operation of STACY (Static Experiment Critical Facility), TRACY (Transient Experiment Critical Facility), and the fuel treatment system. The samples analyzed in FY 2006 were uranyl nitrate solution fuel samples taken before and after experiments at STACY and TRACY, samples for the preparation of uranyl nitrate solution fuel, and samples for nuclear material accountancy purposes; 254 samples were analyzed in total. This report summarizes the analytical work and the management of the analytical laboratory in FY 2006.

JAEA Reports

Analytical work at NUCEF in FY 2005

Fukaya, Hiroyuki; Aoki, Hiromichi; Haga, Takahisa; Nishizawa, Hidetoshi; Sonoda, Takashi; Sakazume, Yoshinori; Shimizu, Kaori; Niitsuma, Yasushi*; Shirahashi, Koichi; Inoue, Takeshi

JAEA-Technology 2007-005, 27 Pages, 2007/03

JAEA-Technology-2007-005.pdf:1.97MB

Analysis of the uranyl nitrate solution fuel is carried out at the analytical laboratory of NUCEF (Nuclear Fuel Cycle Engineering Research Facility), which provides essential data for the operation of STACY (Static Experiment Critical Facility), TRACY (Transient Experiment Critical Facility), and the fuel treatment system. The samples analyzed in FY 2005 were uranyl nitrate solution fuel samples taken before and after experiments at STACY and TRACY, samples for the preparation of uranyl nitrate solution fuel, and samples for nuclear material accountancy purposes. Also analyzed were samples from raffinate treatment and its preliminary tests. The raffinate had been generated since FY 2000 during preliminary experiments on U/Pu extraction-purification to fix the operating conditions for preparing the plutonium solution fuel used in criticality experiments at STACY. This report summarizes the analytical work and the management of the analytical laboratory in FY 2005.

Oral presentation

Aerodynamics simulations on a road racing of bicycles using multi-GPU Large Eddy Simulation based on mesh-refined lattice Boltzmann method

Hasegawa, Yuta; Aoki, Takayuki*; Kobayashi, Hiromichi*; Shirasaki, Keita*

no journal

We implemented and performed a large-scale LES aerodynamics simulation of a bicycle road race. The mesh-refined lattice Boltzmann method with the coherent-structure Smagorinsky model was employed for multi-GPU computing. Validation against solo riding and group riding of 4 cyclists showed good agreement with previous experiments and CFD simulations. As a large-scale benchmark problem, an aerodynamics simulation of a group of 72 cyclists was performed on 192 GPUs over 4 days, a computational cost reasonable enough for application studies.

Oral presentation

Tree cutting approach for reducing communication in domain partitioning of tree-based block-structured adaptive mesh refinement

Hasegawa, Yuta; Aoki, Takayuki*; Kobayashi, Hiromichi*; Idomura, Yasuhiro; Onodera, Naoyuki

no journal

We developed a block-structured static adaptive mesh refinement (AMR) CFD code for aerodynamics simulations using the lattice Boltzmann method on GPU supercomputers. The data structure of the AMR is based on a forest of octrees, and the domain partitioning algorithm is based on space-filling curves (SFCs). To reduce the halo data communication, we introduced the tree cutting approach, which divides the global domain of a few large octrees into many small octrees, leading to a hierarchical domain partitioning with a coarse structured block decomposition and fine SFC partitioning within each block. The tree cutting improved the locality of the sub-divided domains and reduced both the amount of communication data and the number of connections in the halo communication. In the strong scaling test on a Tesla V100 GPU supercomputer, the tree cutting approach showed a $$\times 1.82$$ speedup at a performance of 2207 MLUPS (mega-lattice updates per second) on 128 GPUs.
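The two quantities this abstract says the tree cutting reduces, communication volume and number of connections, can be measured on a toy 2-D grid by counting cut faces between ranks. The sketch below compares row-strip partitions against compact square blocks; everything here is an illustrative stand-in, not the paper's measurement code.

```python
# Toy halo-communication metrics for a partitioned 2-D grid:
# cut faces ~ communication volume, neighbor count ~ connections.
# halo_stats and the example partitions are hypothetical.

def halo_stats(owner, nx, ny):
    """owner[(x, y)] -> rank. Returns (cut_faces, max_connections)."""
    cut = 0
    links = {}
    for x in range(nx):
        for y in range(ny):
            r = owner[(x, y)]
            for dx, dy in ((1, 0), (0, 1)):  # east and north faces
                nbr = (x + dx, y + dy)
                if nbr in owner and owner[nbr] != r:
                    cut += 1
                    links.setdefault(r, set()).add(owner[nbr])
                    links.setdefault(owner[nbr], set()).add(r)
    return cut, max(len(s) for s in links.values())

# 8x8 grid, 4 ranks: thin row strips vs compact 4x4 square blocks.
strips = {(x, y): y // 2 for x in range(8) for y in range(8)}
squares = {(x, y): (x // 4) * 2 + y // 4 for x in range(8) for y in range(8)}
```

On this grid the strips expose 24 cut faces while the squares expose 16, a small-scale analogue of the locality gain that the hierarchical block-plus-SFC partitioning provides.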
