2025 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE, ISPASS
Abstract
The need to process real-time data at the network edge is growing. Low-power AI accelerators, especially edge GPUs, help meet this demand by mitigating the latency and bandwidth costs of cloud offloading. However, GPUs remain underutilised even under heavy workloads, owing to a limited understanding of resource sharing in edge computing. This work analyses key GPU metrics on NVIDIA Jetson devices under concurrent vision-inference workloads: overall utilisation, memory usage, streaming multiprocessor (SM) activity, and tensor-core activity. Our findings show that while reported GPU utilisation can reach 100% with optimisations, SMs and tensor cores often run at only 15-30% of capacity.