Deepfake Detection on Faces in Video with Cross-Attention Multi-Scale Vision Transformer and EfficientNet

Manurung, Gery Jonathan

dc.contributor.advisor	Nasution, Umaya Ramadhani Putri
dc.contributor.advisor	Sawaluddin
dc.contributor.author	Manurung, Gery Jonathan
dc.date.accessioned	2025-07-19T07:42:43Z
dc.date.available	2025-07-19T07:42:43Z
dc.date.issued	2025
dc.identifier.uri	https://repositori.usu.ac.id/handle/123456789/105815
dc.description.abstract	The proliferation of sophisticated deepfake videos poses a serious threat to digital trust and security, demanding detection systems that are both accurate and computationally efficient enough for practical use. This research designs, implements, and evaluates a hybrid architecture that balances high accuracy with inference efficiency for video-based deepfake detection. The proposed model integrates EfficientNet-B1 as a feature extractor with a Cross-Attention Multi-Scale Vision Transformer (Cross-ViT) for context modeling. The model was trained on a combination of the FaceForensics++ and Celeb-DF (v2) datasets and evaluated on an out-of-distribution dataset, DeepFakeDetection (DFD), to test its generalization. The evaluation results show reliable detection performance, with an Area Under the Curve (AUC) of 92.35% and a video-level F1-score of 83.62%. The model's primary advantage is its computational efficiency: per-frame inference requires only 0.349 GFLOPs, despite a large parameter count (114.33 million). The study also finds that a small batch size, Face-Cutout augmentation, and a Binary Cross-Entropy (BCE) loss function contribute significantly to improved generalization and effective video-level aggregation. The results validate an efficient and scalable hybrid architecture that offers a practical solution for deepfake detection by balancing accuracy, inference speed, and model size.	en_US
dc.language.iso	id	en_US
dc.publisher	Universitas Sumatera Utara	en_US
dc.subject	Deepfake Detection	en_US
dc.subject	Vision Transformer	en_US
dc.subject	Cross-ViT	en_US
dc.subject	EfficientNet	en_US
dc.subject	Computational Efficiency	en_US
dc.subject	Deep Learning	en_US
dc.title	Deteksi Deepfake pada Wajah dalam Video dengan Cross-Attention Multi-Scale Vision Transformer dan EfficientNet	en_US
dc.title.alternative	Deepfake Detection on Faces in Video with Cross-Attention Multi-Scale Vision Transformer and EfficientNet	en_US
dc.type	Thesis	en_US
dc.identifier.nim	NIM211402137
dc.identifier.nidn	NIDN0011049114
dc.identifier.nidn	NIDN0031125982
dc.identifier.kodeprodi	KODEPRODI59201#Teknologi Informasi
dc.description.pages	85 Pages	en_US
dc.description.type	Skripsi Sarjana	en_US
dc.subject.sdgs	SDGs 16. Peace, Justice And Strong Institutions	en_US
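
The abstract reports a video-level F1-score obtained after aggregating per-frame predictions, but this record does not say how the aggregation was done. The sketch below shows one common way this could work, assuming that per-frame fake probabilities are averaged and thresholded at 0.5. The averaging rule, the threshold, the function names, and the sample values are all illustrative assumptions, not details taken from the thesis.

```python
from statistics import mean


def video_score(frame_probs):
    """Average per-frame fake probabilities into one video-level score.

    Mean aggregation is an assumption; the thesis may use another rule.
    """
    if not frame_probs:
        raise ValueError("need at least one frame probability")
    return mean(frame_probs)


def f1_score(labels, predictions):
    """F1 for the positive (fake = 1) class from binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-frame model outputs (probability that a frame is fake).
videos = {
    "video_a": [0.9, 0.8, 0.7],
    "video_b": [0.2, 0.1, 0.3],
    "video_c": [0.6, 0.4, 0.7],
    "video_d": [0.3, 0.6, 0.3],
}
# Hypothetical ground truth: 1 = fake, 0 = real.
labels = {"video_a": 1, "video_b": 0, "video_c": 1, "video_d": 1}

THRESHOLD = 0.5  # assumed decision threshold

scores = {name: video_score(probs) for name, probs in videos.items()}
preds = {name: int(score >= THRESHOLD) for name, score in scores.items()}

names = sorted(videos)
f1 = f1_score([labels[n] for n in names], [preds[n] for n in names])
print(f"video-level F1: {f1:.2f}")
```

With these made-up values, `video_d` is a fake that averages below the threshold, so it counts as a false negative and lowers recall. A threshold-free metric such as AUC would instead be computed from `scores` directly.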

Files in this item

Name: Deteksi Deepfake pada Wajah dalam ...
Size: 655.7Kb
Format: PDF
Description: Cover

Name: Gery Jonathan Manurung_Deteksi ...
Size: 3.130Mb
Format: PDF
Description: Fulltext

This item appears in the following Collection(s)

Undergraduate Theses [873]
Skripsi Sarjana