Publication:

GenConViT: Deepfake Video Detection Using Generative Convolutional Vision Transformer

Date: 2025
dc.contributor.author: Deressa, Deressa Wodajo
dc.contributor.author: Mareen, Hannes
dc.contributor.author: Lambert, Peter
dc.contributor.author: Atnafu, Solomon
dc.contributor.author: Akhtar, Zahid
dc.contributor.author: Van Wallendael, Glenn
dc.contributor.imecauthor: Deressa, Deressa Wodajo
dc.contributor.imecauthor: Mareen, Hannes
dc.contributor.imecauthor: Lambert, Peter
dc.contributor.imecauthor: Van Wallendael, Glenn
dc.contributor.orcidimec: Mareen, Hannes::0000-0002-0660-3190
dc.contributor.orcidimec: Lambert, Peter::0000-0001-5313-4158
dc.contributor.orcidimec: Van Wallendael, Glenn::0000-0001-9530-3466
dc.date.accessioned: 2025-06-30T10:32:09Z
dc.date.available: 2025-06-30T03:57:08Z
dc.date.available: 2025-06-30T10:32:09Z
dc.date.issued: 2025
dc.description.abstract: Deepfakes have raised significant concerns due to their potential to spread false information and compromise the integrity of digital media. Current deepfake detection models often struggle to generalize across a diverse range of deepfake generation techniques and video content. In this work, we propose a Generative Convolutional Vision Transformer (GenConViT) for deepfake video detection. Our model combines ConvNeXt and Swin Transformer models for feature extraction, and it utilizes an Autoencoder and a Variational Autoencoder to learn from latent data distributions. By learning from both visual artifacts and the latent data distribution, GenConViT achieves improved performance in detecting a wide range of deepfake videos. The model is trained and evaluated on the DFDC, FF++, TM, DeepfakeTIMIT, and Celeb-DF (v2) datasets. The proposed GenConViT model demonstrates strong performance in deepfake video detection, achieving high accuracy across the tested datasets. While our model shows promising results in deepfake video detection by leveraging visual and latent features, we demonstrate that further work is needed to improve its generalizability when encountering out-of-distribution data. Our model provides an effective solution for identifying a wide range of fake videos while preserving the integrity of digital media.
dc.description.wosFundingText: This research was funded by the Addis Ababa University Research Grant for Adaptive Problem-Solving Research (reference number RD/PY-183/2021, grant number AR/048/2021); the Research Foundation-Flanders (FWO) under project grant G0A2523N; the Flemish government (COM-PRESS project, within the relance plan Vlaamse Veerkracht); IDLab (Ghent University-imec); Flanders Innovation and Entrepreneurship (VLAIO); and the European Union.
dc.identifier.doi: 10.3390/app15126622
dc.identifier.issn: 2076-3417
dc.identifier.uri: https://imec-publications.be/handle/20.500.12860/45865
dc.publisher: MDPI
dc.source.beginpage: 1
dc.source.endpage: 21
dc.source.issue: 12
dc.source.journal: APPLIED SCIENCES-BASEL
dc.source.numberofpages: 21
dc.source.volume: 15
dc.title: GenConViT: Deepfake Video Detection Using Generative Convolutional Vision Transformer
dc.type: Journal article
dspace.entity.type: Publication
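
Note: the abstract above describes a generative two-branch design (an Autoencoder branch and a Variational Autoencoder branch, each feeding a hybrid ConvNeXt + Swin Transformer feature extractor). The following is a minimal PyTorch sketch of that idea, assuming timm backbones ("convnext_tiny", "swin_tiny_patch4_window7_224"), toy encoder/decoder sizes, and logit averaging for fusion. It is not the authors' released GenConViT code; all layer sizes, names, and the fusion rule are illustrative assumptions.

```python
# Minimal sketch of the two-branch idea in the abstract (NOT the authors'
# released implementation): an Autoencoder (AE) branch and a Variational
# Autoencoder (VAE) branch, each followed by a hybrid ConvNeXt + Swin
# Transformer feature extractor and a binary (real/fake) head.
import torch
import torch.nn as nn
import timm


class HybridFeatures(nn.Module):
    """Concatenate pooled ConvNeXt and Swin features (backbones via timm)."""
    def __init__(self):
        super().__init__()
        self.convnext = timm.create_model("convnext_tiny", pretrained=False, num_classes=0)
        self.swin = timm.create_model("swin_tiny_patch4_window7_224", pretrained=False, num_classes=0)
        self.out_dim = self.convnext.num_features + self.swin.num_features

    def forward(self, x):
        return torch.cat([self.convnext(x), self.swin(x)], dim=1)


class AEBranch(nn.Module):
    """AE branch: reconstruct the frame, then classify the reconstruction."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )
        self.features = HybridFeatures()
        self.head = nn.Linear(self.features.out_dim, 2)

    def forward(self, x):
        recon = self.decoder(self.encoder(x))
        return self.head(self.features(recon)), recon


class VAEBranch(nn.Module):
    """VAE branch: sample a latent code, reconstruct, then classify."""
    def __init__(self, latent_dim=256, img_size=224):
        super().__init__()
        self.img_size = img_size
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mu = nn.Linear(64, latent_dim)
        self.logvar = nn.Linear(64, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 3 * img_size * img_size), nn.Sigmoid())
        self.features = HybridFeatures()
        self.head = nn.Linear(self.features.out_dim, 2)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        recon = self.decoder(z).view(-1, 3, self.img_size, self.img_size)
        return self.head(self.features(recon)), recon, mu, logvar


class GenConViTSketch(nn.Module):
    """Average the two branches' logits for the final real/fake prediction."""
    def __init__(self):
        super().__init__()
        self.ae, self.vae = AEBranch(), VAEBranch()

    def forward(self, x):
        logits_ae, _ = self.ae(x)
        logits_vae, _, _, _ = self.vae(x)
        return (logits_ae + logits_vae) / 2


if __name__ == "__main__":
    model = GenConViTSketch()
    frames = torch.randn(2, 3, 224, 224)  # two face-cropped frames
    print(model(frames).shape)            # torch.Size([2, 2])
```

In practice such a detector would be applied to face-cropped frames sampled from each video, with per-frame predictions aggregated to a video-level score, and the branch training losses would combine classification with reconstruction (and, for the VAE branch, a KL-divergence term); the exact losses and aggregation used by GenConViT are described in the paper itself.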
Files

Original bundle

Name: DS909.pdf
Size: 5.91 MB
Format: Adobe Portable Document Format
Description: Published