How to Keep Pushing ML Accelerator Performance? Know Your Rooflines!

Verhelst, Marian; Benini, Luca; Verma, Naveen

doi:10.1109/JSSC.2025.3553765


dc.contributor.author	Verhelst, Marian
dc.contributor.author	Benini, Luca
dc.contributor.author	Verma, Naveen
dc.contributor.imecauthor	Verhelst, Marian
dc.contributor.orcidimec	Verhelst, Marian::0000-0003-3495-9263
dc.date.accessioned	2025-05-10T05:36:12Z
dc.date.available	2025-05-10T05:36:12Z
dc.date.issued	2025
dc.description.abstract	The rapidly growing importance of machine learning (ML) applications, coupled with their ever-increasing model size and inference energy footprint, has created a strong need for specialized ML hardware architectures. Numerous ML accelerators have been explored and implemented, primarily to increase task-level throughput per unit area and reduce task-level energy consumption. This article surveys key trends toward these objectives for more efficient ML accelerators and provides a unifying framework to understand how compute and memory technologies/architectures interact to enhance system-level efficiency and performance. To achieve this, this article introduces an enhanced version of the roofline model and applies it to ML accelerators as an effective tool for understanding where various execution regimes fall within roofline bounds and how to maximize performance and efficiency under the roofline. Key concepts are illustrated with examples from state-of-the-art (SOTA) designs, with a view toward open research opportunities to further advance accelerator performance.
dc.description.wosFundingText	This work was supported in part by the European Research Council (ERC); in part by the Flanders Artificial Intelligence (AI) Research (FAIR) Program; in part by the Swiss State Secretariat for Education, Research, and Innovation (SERI) through the SwissChips Initiative; and in part by the Defense Advanced Research Projects Agency (DARPA) Optimal Processing Technology Inside Memory Arrays (OPTIMA) under Grant HR00112490300.
dc.identifier.doi	10.1109/JSSC.2025.3553765
dc.identifier.issn	0018-9200
dc.identifier.uri	https://imec-publications.be/handle/20.500.12860/45618
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.source.beginpage	1888
dc.source.endpage	1905
dc.source.issue	6
dc.source.journal	IEEE JOURNAL OF SOLID-STATE CIRCUITS
dc.source.numberofpages	18
dc.source.volume	60
dc.title	How to Keep Pushing ML Accelerator Performance? Know Your Rooflines!
dc.type	Journal article
dspace.entity.type	Publication
Files	Original bundle: How_to_Keep_Pushing_ML_Accelerator_Performance_Know_Your_Rooflines.pdf (5.11 MB, Adobe Portable Document Format; Description: Published)
Publication available in collections:	Articles
