Publication:

Quantized Dynamics Models for Hardware-Efficient Control and Planning in Model-Based RL

dc.contributor.author: Satya Murthy, Nitish
dc.contributor.author: Catthoor, Francky
dc.contributor.author: Verhelst, Marian
dc.contributor.author: Vrancx, Peter
dc.contributor.imecauthor: Murthy, Nitish Satya
dc.contributor.imecauthor: Catthoor, Francky
dc.contributor.imecauthor: Verhelst, Marian
dc.contributor.imecauthor: Vrancx, Peter
dc.contributor.orcidimec: Catthoor, Francky::0000-0002-3599-8515
dc.contributor.orcidimec: Verhelst, Marian::0000-0003-3495-9263
dc.contributor.orcidimec: Vrancx, Peter::0000-0002-9876-3684
dc.date.accessioned: 2025-04-15T04:20:28Z
dc.date.available: 2025-04-15T04:20:28Z
dc.date.issued: 2025
dc.description.abstract: Reinforcement learning (RL) algorithms can help solve continuous control tasks in robotics. Model-based RL in particular has shown promise for these applications: it can be orders of magnitude more sample-efficient than its model-free counterpart by utilizing deep neural network (DNN) based dynamics models. Furthermore, model-based methods are more robust and more generalizable because they are reward-agnostic. However, the computational complexity of planning and control in model-based RL is much higher, which poses challenges for real-time deployment on low-resource hardware at the edge. In our work, we focus on reducing the computational footprint of the dynamics models used in model-based RL. To make the algorithm more hardware-efficient, we introduce block floating point data types for DNNs with optimized block performance and different mixed-precision network configurations. The performance impact of these optimizations is assessed across benchmark continuous robotics control tasks. Total memory savings can be achieved over conventional FP32 networks while reaching comparable rewards in most scenarios.
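The block floating point (BFP) data type mentioned in the abstract groups tensor values into fixed-size blocks that share a single exponent, while each value keeps its own integer mantissa. A minimal sketch of such quantization is shown below; this is an illustration of the general BFP idea, not the paper's implementation, and `block_size` and `mantissa_bits` are illustrative parameters.

```python
import numpy as np

def bfp_quantize(x, block_size=16, mantissa_bits=8):
    """Quantize a 1-D array to block floating point:
    each block of `block_size` values shares one exponent,
    and mantissas are rounded to `mantissa_bits` bits."""
    x = np.asarray(x, dtype=np.float64)
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    # Shared exponent per block: exponent of the largest magnitude.
    max_abs = np.abs(blocks).max(axis=1, keepdims=True)
    max_abs_safe = np.where(max_abs > 0, max_abs, 1.0)  # all-zero blocks stay zero
    exp = np.ceil(np.log2(max_abs_safe))
    # Quantization step implied by the shared exponent and mantissa width.
    scale = 2.0 ** (exp - (mantissa_bits - 1))
    # Round each value to an integer multiple of the block's step.
    q = np.round(blocks / scale) * scale
    return q.reshape(-1)[:len(x)]
```

Storing one exponent per block instead of one per value is what yields the memory savings over FP32, at the cost of coarser resolution for small values that share a block with large ones.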
dc.description.wosFundingText: This research received funding from the Flemish Government (AI Research Program).
dc.identifier.doi: 10.1007/978-3-031-74643-7_16
dc.identifier.eisbn: 978-3-031-74643-7
dc.identifier.isbn: 978-3-031-74642-0
dc.identifier.issn: 1865-0929
dc.identifier.uri: https://imec-publications.be/handle/20.500.12860/45534
dc.publisher: SPRINGER INTERNATIONAL PUBLISHING AG
dc.source.beginpage: 196
dc.source.conference: 8th European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
dc.source.conferencedate: 2023-09-18
dc.source.conferencelocation: Turin
dc.source.endpage: 209
dc.source.journal: Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2023)
dc.source.numberofpages: 14
dc.title: Quantized Dynamics Models for Hardware-Efficient Control and Planning in Model-Based RL
dc.type: Proceedings paper
dspace.entity.type: Publication