AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture

Latest revision as of 00:02, 29 January 2021

Abstract

CPU-FPGA heterogeneous architectures are attracting ever-increasing attention as a way to advance computational capability and energy efficiency in today's datacenters. These architectures let programmers reprogram the FPGAs for flexible acceleration of many workloads. Nonetheless, this advantage is often overshadowed by the poor programmability of FPGAs, whose programming is conventionally an RTL design practice. Although recent advances in high-level synthesis (HLS) have significantly improved FPGA programmability, programmers still face the challenge of identifying the optimal design configuration in a tremendous design space. This paper aims to address this challenge and pave the path from software programs to high-quality FPGA accelerators. Specifically, we first propose the composable, parallel and pipeline (CPP) microarchitecture as a template for accelerator designs. Such a well-defined template supports efficient accelerator designs for a broad class of computation kernels and, more importantly, drastically reduces the design space. We also introduce an analytical model that captures the performance and resource trade-offs among different design configurations of the CPP microarchitecture, which lays the foundation for fast design space exploration. On top of the CPP microarchitecture and its analytical model, we develop the AutoAccel framework to automate the entire accelerator generation process. AutoAccel accepts a software program as input and performs a series of code transformations, guided by the results of the analytical-model-based design space exploration, to construct the desired CPP microarchitecture. Our experiments show that the AutoAccel-generated accelerators outperform their corresponding software implementations by an average of 72x for a broad class of computation kernels.
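The idea of analytical-model-based design space exploration mentioned in the abstract can be illustrated with a toy sketch. This is not the AutoAccel model: the two design knobs (`parallel_factor`, `pipelined`), the cost formulas, and all constants below are hypothetical, chosen only to show how a closed-form estimate lets a tool rank configurations without synthesizing each one.

```python
from dataclasses import dataclass
from itertools import product
from math import ceil


@dataclass(frozen=True)
class Config:
    parallel_factor: int  # hypothetical knob: number of duplicated processing elements
    pipelined: bool       # hypothetical knob: whether the loop body is pipelined


def estimate_cycles(cfg, trip_count=1024, body_latency=8):
    """Toy latency model (assumed formula, not taken from the paper)."""
    iters = ceil(trip_count / cfg.parallel_factor)
    if cfg.pipelined:
        # Assume initiation interval 1: one new iteration per cycle after fill.
        return body_latency + iters - 1
    return body_latency * iters


def estimate_resources(cfg, pe_cost=100, pipeline_overhead=40):
    """Toy resource model in arbitrary units (assumed formula)."""
    per_pe = pe_cost + (pipeline_overhead if cfg.pipelined else 0)
    return per_pe * cfg.parallel_factor


def explore(budget, factors=(1, 2, 4, 8, 16, 32)):
    """Return the fastest configuration that fits the budget, or None."""
    candidates = [Config(p, pl) for p, pl in product(factors, (False, True))]
    feasible = [c for c in candidates if estimate_resources(c) <= budget]
    if not feasible:
        return None
    # Prefer fewer cycles; break ties with lower resource usage.
    return min(feasible, key=lambda c: (estimate_cycles(c), estimate_resources(c)))


if __name__ == "__main__":
    best = explore(budget=1000)
    print(best, estimate_cycles(best), estimate_resources(best))
```

Because every candidate is scored by evaluating two small formulas, the whole space can be enumerated exhaustively here; a real flow would use a far richer model and a much larger space.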

Original document

The different versions of the original document can be found in:

http://arxiv.org/abs/1809.07683

http://arxiv.org/pdf/1809.07683

http://xplorestaging.ieee.org/ielx7/8430031/8465567/08465940.pdf?arnumber=8465940

http://dx.doi.org/10.1109/dac.2018.8465940

https://dblp.uni-trier.de/db/conf/dac/dac2018.html#CongWYZ18

https://arxiv.org/pdf/1809.07683.pdf

https://dl.acm.org/citation.cfm?doid=3195970.3195999

https://dl.acm.org/doi/pdf/10.1145/3195970.3195999

https://arxiv.org/abs/1809.07683

http://export.arxiv.org/pdf/1809.07683

http://export.arxiv.org/abs/1809.07683

https://fr.arxiv.org/pdf/1809.07683

https://uk.arxiv.org/abs/1809.07683

https://fr.arxiv.org/abs/1809.07683

https://academic.microsoft.com/#/detail/2809170821

http://dx.doi.org/10.1145/3195970.3195999 under the license http://www.acm.org/publications/policies/copyright_policy#Background

DOIs: 10.1145/3195970.3195999, 10.1109/dac.2018.8465940
