NUMA-Aware Strategies for the Heterogeneous Execution of SPMV on Modern Supercomputers

Latest revision as of 17:39, 11 March 2021

Abstract

The sparse matrix-vector product is a widely used operation in scientific computing. It represents the dominant computational cost in many large-scale simulations relying on iterative methods, and its performance is sensitive to the sparsity pattern, the storage format and kernel implementation, and the target computing architecture. In this work, we focus on the efficient execution of the sparse matrix-vector product on (potentially hybrid) modern supercomputers with non-uniform memory access configurations. A hierarchical parallel implementation is proposed to minimise the number of processes participating in distributed-memory parallelisation. As a result, a single process per computing node is enough to engage all its hardware and ensure efficient memory access on manycore platforms. The benefits of this approach have been demonstrated on up to 9,600 cores of the MareNostrum 4 supercomputer at the Barcelona Supercomputing Center.
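To make the hierarchical idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the matrix is stored in CSR format (the abstract does not name a format) and that the rows owned by one process are split into contiguous blocks, one per NUMA domain, balanced by number of nonzeros. The names `spmv_csr`, `split_rows`, `hierarchical_spmv` and the parameter `n_domains` are hypothetical. The blocks are processed in a plain loop here; in a real code each block would be handled by threads pinned to the NUMA domain that holds its data.

```python
def spmv_csr(row_ptr, col_idx, values, x, row_begin, row_end):
    """Compute rows [row_begin, row_end) of y = A x for a CSR matrix A."""
    y = []
    for i in range(row_begin, row_end):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def split_rows(row_ptr, n_parts):
    """Split rows into n_parts contiguous blocks using nonzero-count targets."""
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    bounds = [0]
    for p in range(1, n_parts):
        target = nnz * p / n_parts
        r = bounds[-1]
        # Advance until the nonzeros preceding row r reach the target.
        while r < n_rows and row_ptr[r] < target:
            r += 1
        bounds.append(r)
    bounds.append(n_rows)
    return list(zip(bounds[:-1], bounds[1:]))


def hierarchical_spmv(row_ptr, col_idx, values, x, n_domains):
    """One process: split its rows across n_domains blocks (assumed NUMA
    domains) and compute each block independently."""
    y = []
    for begin, end in split_rows(row_ptr, n_domains):
        # Stand-in for work done by the threads of one NUMA domain.
        y.extend(spmv_csr(row_ptr, col_idx, values, x, begin, end))
    return y


# Small 4x4 example matrix:
# [[4, 0, 1, 0],
#  [0, 3, 0, 0],
#  [2, 0, 5, 1],
#  [0, 0, 0, 6]]
row_ptr = [0, 2, 3, 6, 7]
col_idx = [0, 2, 1, 0, 2, 3, 3]
values = [4, 1, 3, 2, 5, 1, 6]
x = [1, 2, 3, 4]

y = hierarchical_spmv(row_ptr, col_idx, values, x, n_domains=2)
```

Because each block writes a disjoint range of the output vector, the blocks can run concurrently without synchronisation on the output; only reads of `x` cross block boundaries, which is where data placement across NUMA domains starts to matter.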

Revision of 17:39, 11 March 2021 by Scipediacontent; page moved from Draft Content 721176048 to Alvarez-Farre et al 2021a.