Sistem Pendeteksi Spam Pesan Singkat Bahasa Indonesia Menggunakan IndoBERT dan Multi-Graph Convolutional Network
Indonesian Short Message Spam Detection System Using IndoBERT and Multi-Graph Convolutional Network
Date
2026
Author
Choiry, Abby Fakhri
Advisor(s)
Candra, Ade
Herriyance
Abstract
Short Message Service (SMS) in Indonesia remains a primary medium for disseminating
unwanted spam content, ranging from aggressive commercial advertisements to
harmful financial fraud attempts that pose significant risks to users. The core challenge
in SMS spam detection lies in the dynamic linguistic complexity of the Indonesian
language, which frequently incorporates slang, inconsistent abbreviations, and code-
switching patterns that hinder traditional keyword-based filtering techniques. While
transformer-based language models such as IndoBERT have demonstrated exceptional
performance in semantic context understanding, they often struggle with structural text
obfuscation designed to bypass automated security systems. This research proposes a
hybrid architecture that integrates the semantic representations of
IndoBERT with the structural analysis capabilities of a Multi-Graph Convolutional
Network (Multi-GCN). In this approach, relational patterns between words are modeled
as graphs to augment the contextual representations of the base model. This study
utilizes a curated dataset of Indonesian SMS messages to train and evaluate the
performance of the proposed hybrid model. In the experiments, the model reaches
a peak accuracy of 99.93% and an F1-score of 1.00. Ablation studies further
indicate that incorporating graph-based features significantly improves detection
precision and reduces the false positive rate compared with the standalone
language model. Beyond its contribution to Natural Language Processing research,
the model is implemented as a real-time detection prototype to validate its
robustness and feasibility against spam threats in real-world communication
scenarios.
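
The idea of modeling relations between words as a graph and passing node features through a graph convolution can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes a single sliding-window co-occurrence graph (the thesis uses several graphs), one-hot placeholder features where IndoBERT embeddings would normally go, and a fixed hypothetical weight matrix instead of trained parameters. All function names and the sample message are illustrative.

```python
import math


def build_cooccurrence_adjacency(tokens, window=1):
    """Link each word to the next `window` words in the message (undirected)."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    adj = [[0.0] * len(vocab) for _ in vocab]
    for pos, word in enumerate(tokens):
        for other in tokens[pos + 1 : pos + 1 + window]:
            i, j = index[word], index[other]
            if i != j:
                adj[i][j] = adj[j][i] = 1.0
    return vocab, adj


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(a_norm, features, weights):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in propagated]


# Hypothetical example message; one-hot features stand in for IndoBERT embeddings.
tokens = "selamat anda menang hadiah klik link ini".split()
vocab, adj = build_cooccurrence_adjacency(tokens, window=2)
a_norm = normalize_adjacency(adj)
features = [[1.0 if i == j else 0.0 for j in range(len(vocab))] for i in range(len(vocab))]
weights = [[0.5, -0.5] for _ in vocab]  # untrained, illustrative values only
hidden = gcn_layer(a_norm, features, weights)
```

In a hybrid model of the kind described, node representations such as `hidden` would typically be pooled and combined with the language model's sentence representation before classification; how the thesis performs that fusion is not stated in the abstract.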
Collections
- Undergraduate Theses
