
Performance measurement of an urban wind simulation with lattice Boltzmann method on Tesla A100 GPUs

Hasegawa, Yuta; Onodera, Naoyuki; Asahi, Yuichi; Idomura, Yasuhiro

We are developing a real-time urban wind simulation code called CityLBM. In this paper, we measure the performance of CityLBM on Tesla A100 GPUs. To optimize communication over a heterogeneous network, with NVLink connections within a node and InfiniBand connections between nodes, we designed a blocked two-dimensional domain partitioning in which blocks of 2 $$\times$$ 2 or 2 $$\times$$ 4 subdomains are confined within each node. Strong scaling was tested on a problem with 2.4 billion grid points. The results showed good strong scalability and performance: a 2.81$$\times$$ speedup from 80 GPUs to 256 GPUs, and a 1.15$$\times$$ speedup from the blocked domain partitioning. Finally, a simulation with 1 m resolution over a 5.7 km $$\times$$ 5.7 km horizontal region exceeded real-time performance, running 1.32$$\times$$ faster than real time.
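
The blocked partitioning described in the abstract can be illustrated with a small Python sketch. This is not CityLBM code: the 8 $$\times$$ 8 subdomain grid, the function names, and the row-major baseline used for comparison are all assumptions made for illustration. The sketch counts how many neighbor links between subdomains cross a node boundary (and would therefore use the inter-node network) when subdomains are grouped in 2 $$\times$$ 2 blocks per node versus assigned to nodes in plain row-major order. It also computes the parallel efficiency implied by the reported 2.81$$\times$$ speedup from 80 to 256 GPUs.

```python
# Illustrative sketch only; layout and names are hypothetical, not from CityLBM.

def blocked_node(ix, iy, px, bx, by):
    """Node index when each node owns a bx-by-by block of subdomains."""
    nodes_per_row = px // bx
    return (iy // by) * nodes_per_row + (ix // bx)

def row_major_node(ix, iy, px, gpus_per_node):
    """Node index when ranks are numbered row by row and packed onto nodes."""
    return (iy * px + ix) // gpus_per_node

def count_inter_node_links(px, py, node_of):
    """Count nearest-neighbor subdomain pairs that sit on different nodes."""
    count = 0
    for iy in range(py):
        for ix in range(px):
            # Neighbor in +x direction (non-periodic boundary assumed).
            if ix + 1 < px and node_of(ix, iy) != node_of(ix + 1, iy):
                count += 1
            # Neighbor in +y direction.
            if iy + 1 < py and node_of(ix, iy) != node_of(ix, iy + 1):
                count += 1
    return count

def parallel_efficiency(speedup, gpus_before, gpus_after):
    """Measured speedup divided by the ideal speedup gpus_after / gpus_before."""
    return speedup / (gpus_after / gpus_before)

if __name__ == "__main__":
    PX, PY = 8, 8  # assumed subdomain grid: 64 GPUs, 4 GPUs per node
    blocked = count_inter_node_links(
        PX, PY, lambda ix, iy: blocked_node(ix, iy, PX, 2, 2))
    naive = count_inter_node_links(
        PX, PY, lambda ix, iy: row_major_node(ix, iy, PX, 4))
    print("inter-node links, 2x2 blocks:", blocked)
    print("inter-node links, row-major :", naive)
    # Speedup figure taken from the abstract: 2.81x from 80 to 256 GPUs.
    print("parallel efficiency:", parallel_efficiency(2.81, 80, 256))
```

In this assumed layout, the blocked assignment leaves 48 links crossing node boundaries compared with 64 for the row-major assignment, which is the kind of reduction in inter-node traffic the blocked partitioning is meant to achieve. The reported speedup corresponds to a parallel efficiency of about 88% relative to the ideal 3.2$$\times$$.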







