

Abstract

Intel's latest Xeon Phi processor, Knights Landing (KNL), has the potential to provide over 2.6 TFLOPS. However, obtaining maximum performance on the KNL still requires significant refactoring and optimization of application codes to exploit its key architectural innovations: wide vector units, a many-core node design, and a deep memory hierarchy. This paper describes the experience and insights gained in porting and running FEFLO (a typical edge-based Finite Element code for the solution of compressible and incompressible FLOws) on the KNL platform. In particular, it considers optimizations that extract on-node parallelism via vectorization and multithreading and that improve inter-node communication. These optimizations resulted in a 2.3X performance gain on a 16-node run of FEFLO, with the potential for larger gains as the code is scaled beyond 16 nodes. The impact of the different configurations of KNL's on-package MCDRAM (Multi-Channel DRAM) memory on FEFLO's performance is also explored. Finally, the performance of the optimized versions of FEFLO on KNL and Haswell (Intel Xeon) is compared.


Document information

Published on 01/01/2018

DOI: 10.1002/fld.4474
Licence: CC BY-NC-SA license
