Title: Quantized Dynamics Models for Hardware-Efficient Control and Planning in Model-Based RL
Authors: Nitish Satya Murthy, Francky Catthoor, Marian Verhelst, Peter Vrancx
Date: 2025-04-15
Type: Proceedings paper
DOI: 10.1007/978-3-031-74643-7_16
ISBN: 978-3-031-74642-0, 978-3-031-74643-7
ISSN: 1865-0929
Web of Science ID: WOS:001437452700016
Handle: https://imec-publications.be/handle/20.500.12860/45534

Abstract: Reinforcement learning (RL) algorithms can help solve continuous control tasks in robotics. Model-based RL in particular has shown promise for these applications. It can be orders of magnitude more sample-efficient than its model-free counterpart by utilizing Deep Neural Network (DNN) based dynamics models. Furthermore, model-based methods are more robust and more generalizable due to being reward-agnostic. However, the computational complexities involved in planning and control in model-based RL are much higher, causing challenges in real-time deployment on low-resource hardware at the edge. In our work, we focus on reducing the computational footprint of the dynamics models used in model-based RL. To make the algorithm more hardware-efficient, we introduce block floating point data types for DNNs with optimized block performance and different mixed-precision network configurations. The performance impact of these optimizations is assessed across benchmark continuous robotics control tasks. Total memory savings can be achieved over conventional FP32 networks while reaching comparable rewards in most scenarios.
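The abstract's core idea, block floating point (BFP), stores a group of values with one shared exponent and short per-value mantissas, which shrinks DNN weight memory relative to FP32. The sketch below illustrates the general BFP quantize/dequantize scheme in NumPy; the block size of 16 and 8-bit mantissas are illustrative assumptions, not the configurations reported in the paper.

```python
import numpy as np

def bfp_quantize(x, block_size=16, mantissa_bits=8):
    """Simulate block floating point (BFP) quantization of a 1-D array.

    Each block of `block_size` values shares one exponent, and each value
    keeps a signed `mantissa_bits`-bit mantissa. Returns the dequantized
    array, so the result shows the rounding error a BFP network would see.
    (Illustrative sketch; parameters are assumptions, not the paper's.)
    """
    x = np.asarray(x, dtype=np.float64)
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)

    # Shared exponent per block: large enough to cover the biggest magnitude.
    max_abs = np.max(np.abs(blocks), axis=1, keepdims=True)
    exp = np.where(max_abs > 0, np.ceil(np.log2(max_abs + 1e-300)), 0.0)

    # Weight of one mantissa LSB in each block.
    scale = 2.0 ** (exp - (mantissa_bits - 1))

    # Round mantissas into the signed integer range and reconstruct.
    qmax = 2 ** (mantissa_bits - 1) - 1
    mant = np.clip(np.round(blocks / scale), -qmax - 1, qmax)
    return (mant * scale).reshape(-1)[: len(x)]
```

Storage-wise, a block of 16 FP32 values takes 512 bits, while the same block in this BFP layout takes 16 × 8 mantissa bits plus one shared 8-bit exponent (136 bits), which is where the memory savings over FP32 come from; mixed-precision configurations would vary `mantissa_bits` per layer.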