dc.contributor.advisor: Nasution, Umaya Ramadhani Putri
dc.contributor.advisor: Sawaluddin
dc.contributor.author: Manurung, Gery Jonathan
dc.date.accessioned: 2025-07-19T07:42:43Z
dc.date.available: 2025-07-19T07:42:43Z
dc.date.issued: 2025
dc.identifier.uri: https://repositori.usu.ac.id/handle/123456789/105815
dc.description.abstract: The proliferation of sophisticated deepfake videos poses a serious threat to digital trust and security, and it demands detection systems that are both accurate and computationally efficient enough for practical use. This research designs, implements, and evaluates a hybrid architecture for video-based deepfake detection that balances high accuracy with inference efficiency. The proposed model uses EfficientNet-B1 as a feature extractor and a Cross-Attention Multi-Scale Vision Transformer (Cross-ViT) for context modeling. The model was trained on a combination of the FaceForensics++ and Celeb-DF (v2) datasets and evaluated on an out-of-distribution dataset, DeepFakeDetection (DFD), to test its generalization. On DFD it achieves an Area Under the Curve (AUC) of 92.35% and a video-level F1-score of 83.62%. The model's primary advantage is its computational efficiency: per-frame inference requires only 0.349 GFLOPs, despite a large parameter count (114.33 million). The study also finds that a small batch size, Face-Cutout augmentation, and a Binary Cross-Entropy (BCE) loss function contribute significantly to improved generalization and effective video-level aggregation. The research validates an efficient and scalable hybrid architecture that offers a practical solution for deepfake detection by balancing accuracy, inference speed, and model size. [en_US]
dc.language.iso: id [en_US]
dc.publisher: Universitas Sumatera Utara [en_US]
dc.subject: Deepfake Detection [en_US]
dc.subject: Vision Transformer [en_US]
dc.subject: Cross-ViT [en_US]
dc.subject: EfficientNet [en_US]
dc.subject: Computational Efficiency [en_US]
dc.subject: Deep Learning [en_US]
dc.title: Deteksi Deepfake pada Wajah dalam Video dengan Cross-Attention Multi-Scale Vision Transformer dan EfficientNet [en_US]
dc.title.alternative: Deepfake Detection on Faces in Video with Cross-Attention Multi-Scale Vision Transformer and EfficientNet [en_US]
dc.type: Thesis [en_US]
dc.identifier.nim: NIM211402137
dc.identifier.nidn: NIDN0011049114
dc.identifier.nidn: NIDN0031125982
dc.identifier.kodeprodi: KODEPRODI59201#Teknologi Informasi
dc.description.pages: 85 Pages [en_US]
dc.description.type: Skripsi Sarjana (Undergraduate Thesis) [en_US]
dc.subject.sdgs: SDGs 16. Peace, Justice And Strong Institutions [en_US]
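
The abstract reports a video-level F1-score, which implies that per-frame predictions are aggregated into one decision per video. The record does not say how the thesis aggregates them, so the following is only a minimal sketch under assumptions: it mean-pools per-frame fake probabilities and applies a 0.5 threshold. The function names, the threshold, and the sample scores are all hypothetical and are not taken from the thesis.

```python
from statistics import mean


def video_score(frame_probs):
    """Average per-frame fake probabilities into one video-level score.

    Mean pooling is an assumption; the thesis may aggregate differently.
    """
    if not frame_probs:
        raise ValueError("a video needs at least one scored frame")
    return mean(frame_probs)


def f1_score(y_true, y_pred):
    """F1 for the positive (fake = 1) class from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-frame probabilities for four videos (1 = fake, 0 = real).
frame_probs_per_video = [
    [0.9, 0.8, 0.7],  # fake, confidently detected
    [0.2, 0.1, 0.3],  # real
    [0.6, 0.4, 0.2],  # fake, missed after averaging
    [0.7, 0.9],       # fake
]
labels = [1, 0, 1, 1]

THRESHOLD = 0.5  # assumed decision threshold
predictions = [int(video_score(p) >= THRESHOLD) for p in frame_probs_per_video]
video_f1 = f1_score(labels, predictions)
```

In this toy data the third video is misclassified because two low-scoring frames pull its average below the threshold, which illustrates why the choice of aggregation rule affects the video-level metric.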

