Improved domain partitioning on tree-based mesh-refined lattice Boltzmann method

Hasegawa, Yuta; Aoki, Takayuki*; Kobayashi, Hiromichi*; Idomura, Yasuhiro; Onodera, Naoyuki

We introduce an improved domain partitioning method, called the "tree cutting approach", for an aerodynamics simulation code based on the lattice Boltzmann method (LBM) with forest-of-octrees-based local mesh refinement (LMR). The conventional domain partitioning algorithm based on the space-filling curve (SFC), which is widely used in LMR, caused costly halo data communication, which became a bottleneck of our aerodynamics simulation on GPU-based supercomputers. Our tree cutting approach adopts a hybrid domain partitioning that combines a coarse structured block decomposition with SFC partitioning inside each block. This hybrid approach improved the locality and the topology of the partitioned sub-domains and reduced the amount of halo communication to one-third of that of the original SFC approach. The code achieved a $$\times 1.23$$ speedup on 8 GPUs and, in a strong scaling test on 128 GPUs, a $$\times 1.82$$ speedup at a performance of 2207 MLUPS (mega-lattice updates per second).
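The difference between the two partitioning strategies described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a uniform grid of integer leaf coordinates (no refinement levels), equal weight per leaf, a Morton (Z-order) curve as the SFC, and the same number of ranks in every coarse block. All function names (`morton3`, `split_evenly`, `sfc_partition`, `hybrid_partition`) are hypothetical and introduced only for illustration.

```python
def morton3(x, y, z, bits=10):
    """Interleave the bits of (x, y, z) into a Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def split_evenly(items, nparts):
    """Cut an ordered list into nparts contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), nparts)
    chunks, start = [], 0
    for p in range(nparts):
        size = base + (1 if p < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def sfc_partition(leaves, nranks):
    """Pure SFC partitioning: order all leaves along one curve, then cut it."""
    ordered = sorted(leaves, key=lambda c: morton3(*c))
    return split_evenly(ordered, nranks)


def hybrid_partition(leaves, domain, blocks, ranks_per_block):
    """Hybrid partitioning (assumed form): bin leaves into coarse structured
    blocks first, then apply the SFC cut independently inside each block."""
    bins = {}
    for c in leaves:
        # Per-axis index of the coarse block that contains this leaf.
        b = tuple(c[a] * blocks[a] // domain[a] for a in range(3))
        bins.setdefault(b, []).append(c)
    parts = []
    for b in sorted(bins):
        parts.extend(sfc_partition(bins[b], ranks_per_block))
    return parts


# Usage: an 8x8x8 grid split into 2x1x1 coarse blocks with 4 ranks per block.
leaves = [(x, y, z) for x in range(8) for y in range(8) for z in range(8)]
parts = hybrid_partition(leaves, domain=(8, 8, 8), blocks=(2, 1, 1),
                         ranks_per_block=4)
```

In this sketch every sub-domain is confined to a single coarse block, so its extent is bounded by the block rather than by wherever the global curve happens to wander. The sketch does not measure halo volume and makes no claim about the communication reduction reported above.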



