Element-wise implementation of iterative solvers for FEM problems on the cell processor; An Optimization of the FEM for a low B/F ratio processor

Kushida, Noriyuki

I introduced a new implementation of the finite element method (FEM) that is suited to the cell processor. Because the cell processor offers far higher floating-point performance but a lower byte-per-flop (B/F) ratio than traditional scalar processors, I reduced the amount of memory transfer and employed a technique for hiding memory access time. The amount of memory transfer was reduced by accepting additional floating-point operations: data that was required repeatedly was recomputed rather than stored. In this study, this memory access reduction was applied to the conjugate gradient (CG) method. To achieve it, element-wise computation was employed, which avoids the global coefficient matrix that causes frequent memory access. Moreover, all data transfer times were concealed behind the calculation time. As a result, my new implementation performed 10 times better than a traditional implementation running on a PPU.
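
The element-wise idea can be illustrated with a minimal sketch. It is not the implementation described in this work: it assumes a 1D Poisson problem (-u'' = 1 on [0, 1], u(0) = u(1) = 0) with linear elements, runs in plain Python rather than on the SPEs, and does not model data transfer hiding. The names `element_stiffness`, `ebe_matvec`, `cg`, and `solve_demo` are hypothetical. It only shows how CG can run without ever assembling a global coefficient matrix, by recomputing each element matrix inside the matrix-vector product.

```python
def element_stiffness(h):
    """2x2 stiffness matrix of one linear 1D element of length h."""
    k = 1.0 / h
    return [[k, -k], [-k, k]]


def ebe_matvec(x, n_elem, h, fixed):
    """Compute y = A x element by element, without a global matrix.

    Each element matrix is recomputed on the fly (extra flops) instead
    of being stored and reloaded (extra memory traffic). Nodes listed
    in `fixed` (Dirichlet nodes) are treated as identity rows.
    """
    y = [0.0] * len(x)
    for e in range(n_elem):
        ke = element_stiffness(h)          # recomputed, never stored
        nodes = (e, e + 1)
        # Gather local values; fixed nodes contribute nothing.
        xe = [0.0 if n in fixed else x[n] for n in nodes]
        for a in range(2):
            if nodes[a] not in fixed:
                # Scatter the local product into the global vector.
                y[nodes[a]] += ke[a][0] * xe[0] + ke[a][1] * xe[1]
    for n in fixed:
        y[n] = x[n]
    return y


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cg(matvec, b, tol=1e-12, max_iter=1000):
    """Plain conjugate gradient; needs only a matrix-vector product."""
    x = [0.0] * len(b)
    r = list(b)                            # residual for x = 0
    p = list(r)
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def solve_demo(n_elem=8):
    """Solve the assumed model problem and return the nodal values."""
    h = 1.0 / n_elem
    fixed = {0, n_elem}
    # Load vector for f = 1: each interior node receives h.
    b = [0.0 if i in fixed else h for i in range(n_elem + 1)]
    return cg(lambda v: ebe_matvec(v, n_elem, h, fixed), b)


if __name__ == "__main__":
    print(solve_demo())
```

In this sketch the only arrays that persist between iterations are the CG vectors; the trade of extra arithmetic for less memory traffic is the same one the abstract describes, although the actual gains depend on the hardware and on the element type.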



