Analisis Sentimen Multi-Label Toxic Comment Menggunakan Metode BERT-BiLSTM (Multi-Label Sentiment Analysis of Toxic Comments Using the BERT-BiLSTM Method)

Date
2023
Author
Putri, Syarifah Kemala
Advisor(s)
Amalia
Abidin, Taufik Fuadi
Abstract
The number of internet users in Indonesia continues to grow every year, and many of them use online forums such as Twitter (now renamed X) to express opinions. Social media also has negative impacts, such as the spread of toxic opinions. As the number of active users grows, so does the volume of stored text. Identifying toxic sentences can help recognize harmful content and limit its spread. A method such as classification is therefore needed to manage this text and extract information from it.
However, simple classification is not enough, because it only separates comments into positive and negative categories. Multi-label classification can assign multiple labels to a single instance, so one statement can belong to several categories at once. This research combines BERT and BiLSTM: BERT produces the word vectors, which are then fed to a BiLSTM model that performs the multi-label classification.
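The difference between multi-label and single-label prediction can be illustrated with a minimal sketch. It assumes the classifier emits one score (logit) per label and that each label is decided independently with a sigmoid and a 0.5 threshold, which is a common setup but is not stated in the abstract. The label names and logit values are hypothetical.

```python
import math

# Hypothetical label names, for illustration only; the abstract does not list the labels.
LABELS = ["toxic", "insult", "threat"]


def sigmoid(x):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Return every label whose independent sigmoid probability meets the threshold.

    Unlike a softmax over classes, each label is decided on its own,
    so a comment can receive zero, one, or several labels.
    """
    probs = [sigmoid(z) for z in logits]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]


# Made-up logits standing in for the classifier output for one comment.
example_logits = [2.0, 0.3, -1.5]
predicted = predict_labels(example_logits)
print(predicted)
```

Here the first two scores pass the threshold and the third does not, so the single comment is assigned two labels at once.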
This study compares two types of word vectors: the sum of the last four hidden layers of BERT and the last hidden layer alone. The model that uses the sum of the last four hidden layers as word vectors attains an accuracy of 0.994, a precision of 0.997, a recall of 0.994, and an F1 score of 0.996.
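The two word-vector variants can be sketched on toy data. The example below assumes the hidden states are available as a list of layers, each holding one vector per token; the nested-list layout, the function names, and the toy values are illustrative assumptions rather than the thesis implementation. In a real pipeline these states would come from a BERT model rather than from hand-built lists.

```python
def sum_last_four(hidden_states):
    """Element-wise sum of the last four layers, computed per token.

    hidden_states: list of layers; each layer is a list of token vectors.
    """
    last_four = hidden_states[-4:]
    num_tokens = len(last_four[0])
    dim = len(last_four[0][0])
    return [
        [sum(layer[t][d] for layer in last_four) for d in range(dim)]
        for t in range(num_tokens)
    ]


def last_layer(hidden_states):
    """Token vectors taken from the final layer only."""
    return [list(token_vector) for token_vector in hidden_states[-1]]


# Toy stand-in for BERT outputs: 5 layers, 2 tokens, 3 dimensions.
# Layer k (1..5) holds the value k everywhere, so the results are easy to check.
toy_hidden_states = [[[float(k)] * 3 for _ in range(2)] for k in range(1, 6)]

summed_vectors = sum_last_four(toy_hidden_states)  # each entry is 2 + 3 + 4 + 5
final_vectors = last_layer(toy_hidden_states)      # each entry is 5
print(summed_vectors)
print(final_vectors)
```

Both variants yield one vector per token with the same dimensionality, so either can be fed to the same downstream BiLSTM without changing its input size.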
Collections
- Master Theses [620]