Pendeteksian Kata-Kata Kasar di Media Sosial dengan Menggunakan IndoBERT dan TextCNN
Profanity Detection on Social Media Using IndoBERT and TextCNN

Date
2025
Author
Sijabat, Adelweys Margaretha
Advisor(s)
Jaya, Ivan
Andayani, Ulfi
Abstract
Internet penetration in Indonesia has reached 79.5% of the population, bringing benefits but also a range of harms, particularly for children. Although Indonesians are known for being friendly in person, Microsoft's 2021 Digital Civility Index report shows that the digital civility of Indonesian netizens is very low, the lowest in Southeast Asia. Uncivil online behavior and coarse language can harm users' mental health, especially that of children. Every social media platform sets a minimum age for its users, yet many underage children falsify their birth year to create accounts. The government has launched the Internet Sehat dan Aman (INSAN) program to foster a healthy online environment, but the program is not yet effective, and technology-based solutions are still needed. To address this challenge, this study develops a system that combines the IndoBERT and TextCNN algorithms to identify profanity on social media and replace profane words with the “*” character, minimizing children's exposure to profanity. In testing, the system successfully identified and censored profanity and performed well, with results including 97.8% accuracy. The model is served through a Flask API that connects the website interface and the Chrome extension to the model.
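The censoring step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the IndoBERT and TextCNN classifier is replaced by a hypothetical lexicon lookup (`PLACEHOLDER_LEXICON` and `is_profane` are invented for illustration), and it assumes that each character of a flagged word is replaced by one “*”, which the abstract does not specify.

```python
import re
from typing import Callable

# Hypothetical stand-in for the trained classifier. The thesis uses
# IndoBERT with TextCNN; that model is not reproduced here.
PLACEHOLDER_LEXICON = {"badword", "uglyword"}


def is_profane(word: str) -> bool:
    """Return True if the word is in the placeholder lexicon (case-insensitive)."""
    return word.lower() in PLACEHOLDER_LEXICON


def censor(text: str, flag: Callable[[str], bool] = is_profane) -> str:
    """Replace every character of each flagged word with '*'.

    Assumption: one '*' per character; the abstract only says that
    profane words are replaced with the '*' character.
    """
    def mask(match):
        word = match.group(0)
        return "*" * len(word) if flag(word) else word

    # \w+ matches runs of word characters, so punctuation is left untouched.
    return re.sub(r"\w+", mask, text)


if __name__ == "__main__":
    print(censor("That is a BadWord, really."))
```

Passing a different `flag` callable, for example one that wraps a trained model's per-word prediction, would leave the masking logic unchanged.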
Collections
- Undergraduate Theses [858]