Running large‐scale CFD applications on Intel‐KNL–based clusters

Latest revision as of 11:14, 30 June 2020

Abstract

Intel's latest Xeon Phi processor, Knights Landing (KNL), has the potential to provide over 2.6 TFLOPS. However, obtaining maximum performance on the KNL still requires significant refactoring and optimization of application codes to exploit its key architectural innovations: wide vector units, a many-core node design, and a deep memory hierarchy. This paper describes the experience and insights gained in porting and running FEFLO (a typical edge-based Finite Element code for the solution of compressible and incompressible FLOws) on the KNL platform. In particular, it considers optimizations used to extract on-node parallelism via vectorization and multithreading and to improve inter-node communication. These optimizations resulted in a 2.3X performance gain on a 16-node run of FEFLO, with the potential for larger gains as the code is scaled beyond 16 nodes. The impact of the different configurations of KNL's on-package MCDRAM (Multi-Channel DRAM) memory on FEFLO's performance is also explored. Finally, the performance of the optimized versions of FEFLO on KNL and Haswell (Intel Xeon) is compared.

