Publication:

I Was Blind but Now I See: Implementing Vision-Enabled Dialogue in Social Robots

 
dc.contributor.author: Abbo, Giulio Antonio
dc.contributor.author: Belpaeme, Tony
dc.date.accessioned: 2026-03-24T10:51:51Z
dc.date.available: 2026-03-24T10:51:51Z
dc.date.createdwos: 2025-10-29
dc.date.issued: 2025
dc.description.abstract: In the rapidly evolving landscape of human-robot interaction, the integration of vision capabilities into conversational agents stands as a crucial advancement. This paper presents a ready-to-use implementation of a dialogue manager that leverages the latest progress in Large Language Models (e.g., GPT-4o mini) to enhance the traditional text-based prompts with real-time visual input. LLMs are used to interpret both textual prompts and visual stimuli, creating a more contextually aware conversational agent. The system's prompt engineering, incorporating dialogue with summarisation of the images, ensures a balance between context preservation and computational efficiency. Six interactions with a Furhat robot powered by this system are reported, illustrating and discussing the results obtained. The system can be customised and is available as a stand-alone application, a Furhat robot implementation, and a ROS2 package.
dc.description.wosFundingText: Funded by Horizon Europe VALAWAI (grant agreement 101070930).
dc.identifier.doi: 10.1109/HRI61500.2025.10973830
dc.identifier.isbn: 979-8-3503-7894-8
dc.identifier.issn: 2167-2121
dc.identifier.uri: https://imec-publications.be/handle/20.500.12860/58926
dc.language.iso: eng
dc.provenance.editstepuser: greet.vanhoof@imec.be
dc.publisher: IEEE
dc.source.beginpage: 1176
dc.source.conference: 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI)
dc.source.conferencedate: 2025-03-04
dc.source.conferencelocation: Melbourne
dc.source.endpage: 1180
dc.source.journal: 2025 20TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, HRI
dc.source.numberofpages: 5
dc.title: I Was Blind but Now I See: Implementing Vision-Enabled Dialogue in Social Robots
dc.type: Proceedings paper
dspace.entity.type: Publication
imec.internal.crawledAt: 2025-10-22
imec.internal.source: crawler