Optimasi Sistem Pencarian Karya Ilmiah dari Repositori Institusi USU Berbasis Large Language Model (LLM) dengan Retrieval-Augmented Generation (RAG)
Optimization of the Scientific Paper Search System for the USU Institutional Repository Based on Large Language Models (LLM) with Retrieval-Augmented Generation (RAG)
Date
2026
Author
Ramadan, Andrian Putra
Advisor(s)
Amalia
Tarigan, Jos Timanta
Abstract
The Institutional Repository of Universitas Sumatera Utara (USU) currently relies on keyword-based search, which is limited in its ability to understand semantic context and variations in natural-language queries and therefore often returns results of low relevance to users. This study aims to design and implement a search system for scholarly works that is more efficient, relevant, and context-aware by leveraging large language model (LLM) technology. The research applies a retrieval-augmented generation (RAG) approach that integrates the Gemini model for answer generation with Elasticsearch as the vector database. The system uses a self-query retrieval mechanism to translate user questions into structured queries that automatically combine semantic search with metadata filtering, and it indexes abstracts of scholarly works from the Faculty of Computer Science and Information Technology that have been translated into Indonesian. This approach is designed to mitigate LLM hallucination while improving document search precision. Test results show that the system understands user queries in Indonesian and returns answers that fit the context. The performance evaluation yielded a context precision score of 0.99 and a context recall score of 0.99, indicating a strong ability to rank relevant documents at the top. Overall, integrating an LLM with RAG overcame the weaknesses of conventional search by providing a more intuitive and accurate search experience, although the system remains limited in handling questions that involve statistical aggregation.
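The abstract does not give implementation details, so the following is only a minimal sketch of how a structured query produced by a self-query step might be turned into an Elasticsearch kNN search body that combines vector similarity with metadata filtering. The `StructuredQuery` class, the `build_knn_search_body` helper, the index field names (`abstract_vector`, `title`, `author`, `year`), and the toy query vector are all hypothetical and are not taken from the thesis; in a real system the vector would come from an embedding model and the structured query from the LLM.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class StructuredQuery:
    """Hypothetical output of a self-query step: free text plus metadata constraints."""
    semantic_text: str
    filters: dict[str, Any] = field(default_factory=dict)


def build_knn_search_body(
    query: StructuredQuery,
    query_vector: list[float],
    k: int = 5,
    num_candidates: int = 50,
) -> dict[str, Any]:
    """Build a request body for Elasticsearch kNN search with optional metadata filters."""
    knn: dict[str, Any] = {
        "field": "abstract_vector",  # assumed name of the dense_vector field
        "query_vector": query_vector,
        "k": k,
        "num_candidates": num_candidates,
    }
    if query.filters:
        # Each metadata constraint becomes an exact-match term filter,
        # applied while the nearest neighbours are being collected.
        knn["filter"] = {
            "bool": {
                "filter": [
                    {"term": {name: value}} for name, value in query.filters.items()
                ]
            }
        }
    # Return only the (assumed) metadata fields needed to cite each document.
    return {"knn": knn, "_source": ["title", "author", "year"]}


# Example: "theses from 2023 about hoax detection", already parsed by the self-query step.
structured = StructuredQuery("deteksi hoaks dengan deep learning", {"year": 2023})
body = build_knn_search_body(structured, query_vector=[0.1, 0.2, 0.3])
```

The resulting `body` dictionary could then be sent to the search endpoint of an index whose mapping defines the vector field; keeping the filter inside the `knn` clause means the metadata constraints restrict the candidate set rather than being applied only after the top `k` hits are chosen.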
Collections
- Undergraduate Theses [1273]
