m (Move page script moved page Ruda et al 1970a to Ruda et al 2022a)
 
(3 intermediate revisions by one other user not shown)
Line 3: Line 3:
  
 
Graphics cards that are equipped with Tensor Core units designed for AI applications, for example the NVIDIA Ampere A100, promise very high peak rates concerning their computing power (156 TFLOP/s in single and 312 TFLOP/s in half precision in the case of the A100). This is only achieved when performing arithmetically intensive operations such as dense matrix multiplications in the aforementioned lower precision, which is an obstacle when trying to use this hardware for solving linear systems arising from PDEs discretized with the finite element method. In previous works, we delivered a proof of concept that the predecessor of the A100, the V100 and its Tensor Cores, can be exploited to a great extent when solving Poisson's equation on the unit square if a hardware-oriented direct solver based on prehandling via hierarchical finite elements and a Schur complement approach is used. In this work, using numerical results on an A100 graphics card, we show that the method also achieves a very high performance if Poisson's equation, which is discretized by linear finite elements, is solved on a more complex domain corresponding to a flow around a square configuration.
 
Graphics cards that are equipped with Tensor Core units designed for AI applications, for example the NVIDIA Ampere A100, promise very high peak rates concerning their computing power (156 TFLOP/s in single and 312 TFLOP/s in half precision in the case of the A100). This is only achieved when performing arithmetically intensive operations such as dense matrix multiplications in the aforementioned lower precision, which is an obstacle when trying to use this hardware for solving linear systems arising from PDEs discretized with the finite element method. In previous works, we delivered a proof of concept that the predecessor of the A100, the V100 and its Tensor Cores, can be exploited to a great extent when solving Poisson's equation on the unit square if a hardware-oriented direct solver based on prehandling via hierarchical finite elements and a Schur complement approach is used. In this work, using numerical results on an A100 graphics card, we show that the method also achieves a very high performance if Poisson's equation, which is discretized by linear finite elements, is solved on a more complex domain corresponding to a flow around a square configuration.
 +
 +
== Abstract ==
 +
<pdf>Media:Draft_Sanchez Pinedo_2520567641634_abstract.pdf</pdf>
 +
 +
== Full Paper ==
 +
<pdf>Media:Draft_Sanchez Pinedo_2520567641634_paper.pdf</pdf>

Latest revision as of 16:06, 25 November 2022

Summary

Graphics cards that are equipped with Tensor Core units designed for AI applications, for example the NVIDIA Ampere A100, promise very high peak rates concerning their computing power (156 TFLOP/s in single and 312 TFLOP/s in half precision in the case of the A100). This is only achieved when performing arithmetically intensive operations such as dense matrix multiplications in the aforementioned lower precision, which is an obstacle when trying to use this hardware for solving linear systems arising from PDEs discretized with the finite element method. In previous works, we delivered a proof of concept that the predecessor of the A100, the V100 and its Tensor Cores, can be exploited to a great extent when solving Poisson's equation on the unit square if a hardware-oriented direct solver based on prehandling via hierarchical finite elements and a Schur complement approach is used. In this work, using numerical results on an A100 graphics card, we show that the method also achieves a very high performance if Poisson's equation, which is discretized by linear finite elements, is solved on a more complex domain corresponding to a flow around a square configuration.

Abstract

The PDF file did not load properly or your web browser does not support viewing PDF files. Download directly to your device: Download PDF document

Full Paper

The PDF file did not load properly or your web browser does not support viewing PDF files. Download directly to your device: Download PDF document
Back to Top
GET PDF

Document information

Published on 24/11/22
Accepted on 24/11/22
Submitted on 24/11/22

Volume Science Computing, 2022
DOI: 10.23967/eccomas.2022.292
Licence: CC BY-NC-SA license

Document Score

0

Views 34
Recommendations 0

Share this document

claim authorship

Are you one of the authors of this document?