    •   USU-IR Home
    • Faculty of Computer Science and Information Technology
    • Department of Information Technology
    • Master Theses

    Peningkatan Akurasi Long Short-Term Memori (LSTM) Menggunakan Word2Vec dan Fastext untuk Machine Translation Bahasa Batak-Inggris

    Improving the Accuracy of Long Short-Term Memory (LSTM) Using Word2Vec and FastText for Batak-English Machine Translation

    Date
    2024
    Author
    Nasution, Nur Amalia
    Advisor(s)
    Nababan, Erna Budhiarti
    Mawengkang, Herman
    Abstract
    This research examines the performance of the Long Short-Term Memory (LSTM) algorithm combined with two word embedding techniques, FastText and Word2Vec, for translating text between the Batak and English languages. LSTM, an advanced form of Recurrent Neural Network (RNN), is used for its ability to handle sequential data and retain long-term dependencies. However, its effectiveness in translation tasks depends heavily on the quality of the word embeddings, which provide low-dimensional vector representations of words that capture their semantic and contextual relationships. This study compares LSTM's performance with FastText and Word2Vec embeddings. A dataset of 28,420 Batak-English sentence pairs was collected from various sources, including the Let's Read Asia website and the "Kamus Batak Toba - Indonesia" dictionary. The sentences were embedded using both FastText and Word2Vec, and the resulting vectors were fed into the LSTM model. The model, which consists of encoder and decoder components, was trained over multiple epochs, and its performance was evaluated using the BLEU (Bilingual Evaluation Understudy) score. This metric compares n-grams of the predicted translations with those of reference translations, providing a measure of translation accuracy. The results indicate that the LSTM model with FastText embeddings consistently outperformed the model with Word2Vec embeddings: the FastText-based model achieved an average BLEU score of 0.9516, compared to 0.9389 for the Word2Vec-based model. This superior performance is attributed to FastText's ability to handle out-of-vocabulary words by leveraging subword information, which yields more accurate and contextually relevant translations.
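
The n-gram comparison behind the BLEU score can be illustrated with a minimal sketch. This is not the evaluation code used in the thesis, which the abstract does not describe. It assumes a single reference translation, whitespace-tokenized input, n-grams up to length 4, and no smoothing. The function names (`ngrams`, `modified_precision`, `sentence_bleu`) and the example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed, single-reference BLEU: geometric mean of the clipped
    1..max_n-gram precisions, multiplied by a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, one zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    # Candidates shorter than the reference are penalized.
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Made-up example sentences, not taken from the thesis dataset.
reference = "the cat sat on the mat".split()
candidate = "the cat sat on a mat".split()
print(sentence_bleu(candidate, reference))
```

A candidate identical to its reference scores 1.0 under this sketch, and a single substituted word already lowers the score substantially because it breaks every n-gram that spans it. Averages such as those reported above would be obtained by scoring each test sentence and taking the mean, or by a corpus-level variant; the abstract does not say which was used.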
    URI
    https://repositori.usu.ac.id/handle/123456789/96206
    Collections
    • Master Theses [621]

    Repositori Institusi Universitas Sumatera Utara (RI-USU)
    Universitas Sumatera Utara | Perpustakaan | Resource Guide | Katalog Perpustakaan
    DSpace software copyright © 2002-2016  DuraSpace
    Contact Us | Send Feedback
    Theme by 
    Atmire NV
     

     
