
Development of optimization of stencil calculation on Tera-flops many-core architecture

Ina, Takuya; Asahi, Yuichi; Idomura, Yasuhiro

Plasma turbulence simulations require significant computational resources. In particular, simulations at the scale of the International Thermonuclear Experimental Reactor (ITER) will require exascale machines. The architecture of exascale machines is still undecided, but it is expected to be based on existing architectures. The purpose of this study is to establish optimization techniques for stencil calculations on the Xeon Phi, GPU, and FX100, each of which delivers teraflops-class computing performance. For the Xeon Phi, we apply dynamic scheduling and convert multiply nested loops into a single loop. For the GPU, we reuse registers and avoid warp divergence. For the FX100, we promote software prefetching and improve reuse of the L1 and L2 caches by adjusting the chunk size. Applying these optimizations improved performance on the Xeon Phi, GPU, and FX100, confirming that they are effective optimization methods for stencil calculations on these architectures.
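The change from a multi loop to a single loop mentioned for the Xeon Phi can be illustrated with a minimal sketch. It assumes a 2D 5-point averaging stencil on a row-major grid, which is a stand-in chosen for brevity and not the stencil used in the study; the function names `stencil_nested`, `stencil_single`, and `make_grid` are hypothetical. The sketch shows only the index transformation. In a compiled, threaded code the single loop would typically be the one handed to a dynamic scheduler, since it exposes more iterations to distribute across many cores.

```python
def stencil_nested(u, nx, ny):
    """5-point average over interior points using a doubly nested loop."""
    out = list(u)  # boundary values are copied unchanged
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            c = j * nx + i  # row-major index of the centre point
            out[c] = 0.25 * (u[c - 1] + u[c + 1] + u[c - nx] + u[c + nx])
    return out


def stencil_single(u, nx, ny):
    """Same stencil with the two interior loops fused into one loop."""
    out = list(u)
    inner_x = nx - 2  # number of interior points per row
    for k in range(inner_x * (ny - 2)):
        j, i = divmod(k, inner_x)  # recover the 2D interior indices from k
        c = (j + 1) * nx + (i + 1)
        out[c] = 0.25 * (u[c - 1] + u[c + 1] + u[c - nx] + u[c + nx])
    return out


def make_grid(nx, ny):
    """Small deterministic test grid (illustrative data only)."""
    return [float((i * 7 + j * 3) % 11) for j in range(ny) for i in range(nx)]
```

Both functions visit the same interior points and perform the same arithmetic per point, so they return identical results; only the shape of the iteration space differs.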







