Publication:

Chameleon: A Multimodal Learning Framework Robust to Missing Modalities

 
dc.contributor.author: Liaqat, Muhammad Irzam
dc.contributor.author: Nawaz, Shah
dc.contributor.author: Zaheer, Muhammad Zaigham
dc.contributor.author: Saeed, Muhammad Saad
dc.contributor.author: Sajjad, Hassan
dc.contributor.author: De Schepper, Tom
dc.contributor.author: Nandakumar, Karthik
dc.contributor.author: Khan, Muhammad Haris
dc.contributor.author: Gallo, Ignazio
dc.contributor.author: Schedl, Markus
dc.contributor.imecauthor: De Schepper, Tom
dc.contributor.orcidimec: De Schepper, Tom::0000-0002-2969-3133
dc.date.accessioned: 2025-06-06T04:50:09Z
dc.date.available: 2025-06-06T04:50:09Z
dc.date.issued: 2025
dc.description.abstract: Multimodal learning has demonstrated remarkable performance improvements over unimodal architectures. However, multimodal learning methods often exhibit degraded performance if one or more modalities are missing. This may be attributed to the commonly used multi-branch design containing modality-specific components, which makes such approaches reliant on the availability of a complete set of modalities. In this work, we propose a robust multimodal learning framework, Chameleon, that adapts a common-space visual learning network to align all input modalities. To enable this, we unify the input modalities into a single format by encoding any non-visual modality into a visual representation, thus making the framework robust to missing modalities. Extensive experiments are performed on multimodal classification tasks using four textual-visual (Hateful Memes, UPMC Food-101, MM-IMDb, and Ferramenta) and two audio-visual (avMNIST, VoxCeleb) datasets. Chameleon not only achieves superior performance when all modalities are present at train/test time but also demonstrates notable resilience in the case of missing modalities.
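The abstract's central idea, encoding non-visual modalities into visual representations so that a single shared visual network can consume every input, can be illustrated with a toy sketch. This is an illustrative assumption, not the paper's actual encoding: the function name `text_to_visual` and the byte-to-pixel mapping are hypothetical stand-ins.

```python
# Toy sketch (hypothetical, not Chameleon's actual encoder): map a text
# string onto a fixed-size grayscale "image" so that a visual network
# could process it like any other image input. A missing modality then
# becomes an absent image rather than a broken modality-specific branch.

def text_to_visual(text: str, size: int = 32) -> list[list[float]]:
    """Encode text as a size x size grid of values in [0, 1] by writing
    UTF-8 byte values into a flat pixel buffer, then reshaping."""
    pixels = [0.0] * (size * size)
    data = text.encode("utf-8")[: size * size]  # truncate if too long
    for i, byte in enumerate(data):
        pixels[i] = byte / 255.0  # normalize byte value to [0, 1]
    # Reshape the flat buffer into rows, i.e. an image-shaped grid.
    return [pixels[r * size : (r + 1) * size] for r in range(size)]


grid = text_to_visual("multimodal learning")
print(len(grid), len(grid[0]))  # a 32 x 32 image-shaped encoding
```

The point of the sketch is only the unification step: once every modality arrives in the same visual format, the downstream network needs no modality-specific branches, which is what makes missing modalities less disruptive.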
dc.description.wosFundingText: This research was funded in whole or in part by the Austrian Science Fund (FWF): https://doi.org/10.55776/COE12, https://doi.org/10.55776/DFH23, https://doi.org/10.55776/P36413.
dc.identifier.doi: 10.1007/s13735-025-00370-y
dc.identifier.issn: 2192-6611
dc.identifier.uri: https://imec-publications.be/handle/20.500.12860/45761
dc.publisher: SPRINGER
dc.source.beginpage: 21
dc.source.issue: 2
dc.source.journal: INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL
dc.source.numberofpages: 14
dc.source.volume: 14
dc.title: Chameleon: A Multimodal Learning Framework Robust to Missing Modalities
dc.type: Journal article
dspace.entity.type: Publication
Files

Original bundle

Name: s13735-025-00370-y.pdf
Size: 3.96 MB
Format: Adobe Portable Document Format
Description: Published