Publication:
Child Speech Recognition in Human-Robot Interaction: Problem Solved?
| cris.virtual.orcid | 0000-0001-6301-0028 | |
| cris.virtual.orcid | 0009-0009-4734-1551 | |
| cris.virtual.orcid | 0000-0002-1969-8395 | |
| cris.virtual.orcid | 0000-0002-1790-9531 | |
| cris.virtual.orcid | 0000-0001-5207-7745 | |
| cris.virtualsource.department | ab1b156b-2cca-4ddc-bdb9-155273f95966 | |
| cris.virtualsource.department | f4e3a95e-1307-4c80-948b-185fb3c7b52d | |
| cris.virtualsource.department | a5da3e80-8bca-4be5-bbbc-6737b80b585f | |
| cris.virtualsource.department | 60910c8d-eace-48b6-8e4d-3c2fff94428a | |
| cris.virtualsource.department | 6c1aac4b-593e-4f80-9ecc-911fd20f3c31 | |
| cris.virtualsource.department | 952936d6-a5d1-4952-ab8c-7ba5c377af16 | |
| cris.virtualsource.orcid | ab1b156b-2cca-4ddc-bdb9-155273f95966 | |
| cris.virtualsource.orcid | f4e3a95e-1307-4c80-948b-185fb3c7b52d | |
| cris.virtualsource.orcid | a5da3e80-8bca-4be5-bbbc-6737b80b585f | |
| cris.virtualsource.orcid | 60910c8d-eace-48b6-8e4d-3c2fff94428a | |
| cris.virtualsource.orcid | 6c1aac4b-593e-4f80-9ecc-911fd20f3c31 | |
| cris.virtualsource.orcid | 952936d6-a5d1-4952-ab8c-7ba5c377af16 | |
| dc.contributor.author | Janssens, Ruben | |
| dc.contributor.author | Verhelst, Eva | |
| dc.contributor.author | Abbo, Giulio Antonio | |
| dc.contributor.author | Ren, Qiaoqiao | |
| dc.contributor.author | Pinto Bernal, Maria Jose | |
| dc.contributor.author | Belpaeme, Tony | |
| dc.contributor.imecauthor | Janssens, Ruben | |
| dc.contributor.imecauthor | Verhelst, Eva | |
| dc.contributor.imecauthor | Abbo, Giulio Antonio | |
| dc.contributor.imecauthor | Ren, Qiaoqiao | |
| dc.contributor.imecauthor | Bernal, Maria Jose Pinto | |
| dc.contributor.imecauthor | Belpaeme, Tony | |
| dc.contributor.orcidimec | Janssens, Ruben::0000-0002-1790-9531 | |
| dc.contributor.orcidimec | Verhelst, Eva::0009-0009-4734-1551 | |
| dc.contributor.orcidimec | Abbo, Giulio Antonio::0000-0001-6301-0028 | |
| dc.contributor.orcidimec | Belpaeme, Tony::0000-0001-5207-7745 | |
| dc.date.accessioned | 2025-08-23T03:59:02Z | |
| dc.date.available | 2025-08-23T03:59:02Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Automated Speech Recognition shows superhuman performance for adult English speech on a range of benchmarks, but disappoints when fed children’s speech. This has long stood in the way of child-robot interaction. Recent advances in data-driven speech recognition, including the availability of Transformer architectures and unprecedented volumes of training data, might mean a breakthrough for child speech recognition and social robot applications aimed at children. We revisit a study on child speech recognition from 2017 and show that indeed performance has increased, with newcomer OpenAI Whisper doing markedly better than leading commercial cloud services. Performance improves even more in highly structured interactions when priming models with specific phrases. While transcription is not perfect yet, the best model recognises 60.3% of sentences correctly barring small grammatical differences, with sub-second transcription time running on a local GPU, showing potential for usable autonomous child-robot speech interactions. | |
| dc.description.wosFundingText | This research received funding from imec (Smart Education), the Flemish Government (AI Research Program) and the Horizon Europe VALAWAI project (grant agreement number 101070930). We are indebted to the authors of [9] and [8] for making the recordings and transcriptions available. | |
| dc.identifier.doi | 10.1007/978-981-96-3519-1_43 | |
| dc.identifier.eisbn | 978-981-96-3519-1 | |
| dc.identifier.isbn | 978-981-96-3518-4 | |
| dc.identifier.issn | 2945-9133 | |
| dc.identifier.uri | https://imec-publications.be/handle/20.500.12860/46098 | |
| dc.publisher | SPRINGER-VERLAG SINGAPORE PTE LTD | |
| dc.source.beginpage | 476 | |
| dc.source.conference | 16th International Conference on Social Robotics-ICSR-Empowering Humanity: The Role of Social and Collaborative Robotics in Shaping Our Future | |
| dc.source.conferencedate | 2024-10-24 | |
| dc.source.conferencelocation | Odense | |
| dc.source.endpage | 486 | |
| dc.source.journal | Social Robotics | |
| dc.source.numberofpages | 11 | |
| dc.title | Child Speech Recognition in Human-Robot Interaction: Problem Solved? | |
| dc.type | Proceedings paper | |
| dspace.entity.type | Publication | |