Implementasi Model DeBERTa untuk Prediksi Kompleksitas Kata Berbahasa Inggris
Implementation of the DeBERTa Model for English Word Complexity Prediction

Date
2025
Author
Sihombing, Johansen
Advisor(s)
Arisandi, Dedy
Purnamawati, Sarah
Abstract
Word complexity in English texts poses a significant challenge in the field of Natural
Language Processing (NLP), particularly for the development of automatic text
simplification systems and effective second language learning support tools. Language
learners' comprehension is often hindered by highly complex words. This study aims to
develop and evaluate an English word complexity prediction system using DeBERTa
(Decoding-enhanced BERT with Disentangled Attention), a Transformer model
known for its strong contextual representations. The model was trained and tested
on a dataset comprising 8,554 word entries, compiled from the Complex dataset and
augmented with data from the Oxford Dictionary. Evaluation results demonstrated
excellent predictive performance, achieving a Mean Squared Error (MSE) of 0.0036, a
Mean Absolute Error (MAE) of 0.0402, and a Pearson correlation of 0.9770 on the test
set. These findings indicate that the DeBERTa model assesses word complexity with
high accuracy and generalizes robustly across diverse text domains, underscoring its
potential for NLP applications that analyze and process word complexity.
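As a rough illustration of the approach described in the abstract, the sketch below shows how a DeBERTa checkpoint can be given a single-output regression head for word complexity prediction and how the three reported metrics (MSE, MAE, Pearson correlation) can be computed. This is not the thesis code: the Hugging Face Transformers library, the microsoft/deberta-v3-base checkpoint, the word-plus-sentence input format, and the scikit-learn/SciPy metric calls are all assumptions made for illustration, and the model would still need fine-tuning on the 8,554-entry dataset before its scores are meaningful.

# Minimal sketch (not the thesis implementation) of DeBERTa as a
# regression model for lexical complexity prediction.
import numpy as np
import torch
from scipy.stats import pearsonr
from sklearn.metrics import mean_absolute_error, mean_squared_error
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "microsoft/deberta-v3-base"  # assumed checkpoint; the thesis may use another variant

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# num_labels=1 with problem_type="regression" turns the classification
# head into a single-output regression head (complexity score).
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=1, problem_type="regression"
)

def predict_complexity(word: str, sentence: str) -> float:
    """Score a target word in its sentence context (e.g., on a 0-1 scale)."""
    # Pairing the target word with its sentence is one common input format
    # for lexical complexity prediction; the thesis's exact format is not stated.
    inputs = tokenizer(word, sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.squeeze().item()

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Compute the three metrics reported in the abstract."""
    return {
        "mse": mean_squared_error(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
        "pearson": pearsonr(y_true, y_pred)[0],
    }

In this setup, fine-tuning would minimize a mean-squared-error loss between predicted and annotated complexity scores, after which evaluate() on the held-out test set would yield figures comparable to the MSE, MAE, and Pearson correlation values reported above.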
Collections
- Undergraduate Theses