Publication:

Sound-Based Recognition of Touch Gestures and Emotions for Enhanced Human-Robot Interaction

 
dc.contributor.author: Hou, Yuanbo
dc.contributor.author: Ren, Qiaoqiao
dc.contributor.author: Wang, Wenwu
dc.contributor.author: Botteldooren, Dick
dc.date.accessioned: 2026-06-17T10:16:16Z
dc.date.available: 2026-06-17T10:16:16Z
dc.date.createdwos: 2026-01-06
dc.date.issued: 2025
dc.description.abstract: Emotion recognition and touch gesture decoding are crucial for advancing human-robot interaction (HRI), especially in social environments where emotional cues and tactile perception play important roles. However, many humanoid robots, such as Pepper, Nao, and Furhat, lack full-body tactile skin, which limits their ability to engage in touch-based emotional and gesture interactions. In addition, vision-based emotion recognition methods usually face strict GDPR compliance challenges because they need to collect personal facial data. To address these limitations and avoid privacy issues, this paper studies the potential of using the sounds produced by touch during HRI to recognise tactile gestures and classify emotions along the arousal and valence dimensions. Using a dataset of tactile gestures and emotional interactions from 28 participants with the humanoid robot Pepper, we design a lightweight, audio-only touch gesture and emotion recognition model with only 0.24M parameters, a model size of 0.94 MB, and 0.7G FLOPs. Experimental results show that the proposed model effectively recognises the arousal and valence states of different emotions, as well as various tactile gestures, across varying input audio lengths. The proposed model has low latency and achieves results similar to those of the well-known pretrained audio neural networks (PANNs), but with far fewer FLOPs and parameters and a much smaller model size.
dc.description.wosFundingText: This research received funding from the Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme.
dc.identifier.doi: 10.1109/icassp49660.2025.10890031
dc.identifier.isbn: 979-8-3503-6875-8
dc.identifier.issn: 1520-6149
dc.identifier.uri: https://imec-publications.be/handle/20.500.12860/59734
dc.language.iso: eng
dc.provenance.editstepuser: greet.vanhoof@imec.be
dc.publisher: IEEE
dc.source.conference: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
dc.source.conferencedate: 2025-04-06
dc.source.conferencelocation: Hyderabad
dc.source.journal: 2025 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
dc.source.numberofpages: 5
dc.subject.keywords: CIRCUMPLEX MODEL
dc.subject.keywords: NEURAL-NETWORKS
dc.subject.keywords: PERCEPTION
dc.subject.keywords: VALENCE
dc.subject.keywords: AROUSAL
dc.title: Sound-Based Recognition of Touch Gestures and Emotions for Enhanced Human-Robot Interaction
dc.type: Proceedings paper
dspace.entity.type: Publication
imec.internal.crawledAt: 2026-04-07
imec.internal.source: crawler
imec.internal.wosCreatedAt: 2026-04-07
Files

Original bundle

Name:
Sound-Based_Recognition_of_Touch_Gestures_and_Emotions_for_Enhanced_Human-Robot_Interaction.pdf
Size:
8.78 MB
Format:
Adobe Portable Document Format
Description:
Published
Publication available in collections: