Implementasi Fine-Tuning BERT untuk Intent Classification dan Integrasi Retrieval-Augmented Generation (RAG) pada Chatbot Layanan Akademik di Universitas Sumatera Utara
Implementation of Fine-Tuning BERT for Intent Classification and Integration of Retrieval-Augmented Generation (RAG) in an Academic Service Chatbot at Universitas Sumatera Utara

Date
2025
Author
Purba, M Iqbal
Advisor(s)
Amalia
Harumy, T Henny Febriana
Abstract
Providing fast, accurate, and efficient academic information remains a major challenge for educational institutions, including Universitas Sumatera Utara (USU). Although USU has developed several information platforms, they are not yet well integrated, so students must consult multiple separate sources. Furthermore, because no system can answer student inquiries in real time, students often contact academic staff directly, which reduces service efficiency and increases administrative workload. This study develops an Artificial Intelligence (AI)-based chatbot that integrates academic information at USU so that students can find what they need quickly and easily. The chatbot combines intent classification using IndoBERT with Retrieval-Augmented Generation (RAG) to give accurate, relevant answers in real time. The dataset has two parts: data for the intent classification task, generated automatically with the help of an AI model, and a knowledge base compiled from official USU sources such as websites and official documents. In evaluation, the fine-tuned IndoBERT model achieved 97% on accuracy, precision, recall, and F1-score for intent classification. The RAG process, evaluated with BLEU and ROUGE, achieved a BLEU score of 87%, ROUGE-1 of 90%, ROUGE-2 of 88%, and ROUGE-L of 90%. User feedback on the RAG-generated responses indicated fluency of 99.6%, relevance of 95.68%, and completeness of 91.6%. These results indicate that combining IndoBERT intent classification with RAG is effective for providing real-time academic information at USU.
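As an illustration of how the four classification metrics reported in the abstract are typically computed, the following is a minimal Python sketch. It assumes macro averaging over intent classes, which the abstract does not specify; the function name `macro_scores` and the intent labels `schedule` and `tuition` are hypothetical and do not come from the thesis.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Macro averaging is an assumption here; the abstract does not state
    which averaging scheme produced the reported figures.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Per-class counts of true positives, false positives, false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    accuracy = sum(tp.values()) / len(y_true)
    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with hypothetical intent labels (not data from the thesis).
y_true = ["schedule", "schedule", "tuition", "tuition"]
y_pred = ["schedule", "tuition", "tuition", "tuition"]
accuracy, precision, recall, f1 = macro_scores(y_true, y_pred)
```

On this toy input one of the four predictions is wrong, so accuracy is 0.75, while macro precision, recall, and F1 differ from one another because each class is scored separately before averaging. When all four metrics coincide, as in the reported 97%, errors are usually spread fairly evenly across classes.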
Collections
- Undergraduate Theses