Show simple item record

dc.contributor.advisor: Nababan, Erna Budhiarti
dc.contributor.advisor: Mawengkang, Herman
dc.contributor.author: Nasution, Nur Amalia
dc.date.accessioned: 2024-08-27T08:50:27Z
dc.date.available: 2024-08-27T08:50:27Z
dc.date.issued: 2024
dc.identifier.uri: https://repositori.usu.ac.id/handle/123456789/96206
dc.description.abstract: This research examines the performance of the Long Short-Term Memory (LSTM) algorithm in combination with two word embedding techniques, FastText and Word2Vec, for translating text between the Batak and English languages. LSTM, an advanced form of Recurrent Neural Networks (RNNs), is utilized for its capability to handle sequential data and maintain long-term dependencies. However, LSTM's effectiveness in translation tasks is significantly influenced by the quality of word embeddings, which provide low-dimensional vector representations of words, capturing their semantic and contextual relationships. This study conducted a comparative analysis of LSTM's performance using FastText and Word2Vec embeddings. Data comprising 28,420 Batak-English sentence pairs were collected from various sources, including the Let's Read Asia website and the "Kamus Batak Toba - Indonesia" dictionary. The sentences were then embedded using both FastText and Word2Vec techniques, and the resulting vectors were fed into the LSTM model. The LSTM model, incorporating encoder and decoder components, was trained over multiple epochs, and its performance was evaluated using the BLEU (Bilingual Evaluation Understudy) score. This metric compares n-grams of the predicted translations with reference translations, providing a measure of translation accuracy. The results indicate that the LSTM model with FastText embeddings consistently outperformed the model with Word2Vec embeddings. The FastText-based model achieved an average BLEU score of 0.9516, compared to 0.9389 for the Word2Vec-based model. This superior performance is attributed to FastText's ability to handle out-of-vocabulary words by leveraging subword information, thus providing more accurate and contextually relevant translations. [en_US] (An illustrative code sketch of the embedding-plus-LSTM pipeline described here follows the record below.)
dc.language.iso: id [en_US]
dc.publisher: Universitas Sumatera Utara [en_US]
dc.subject: LSTM [en_US]
dc.subject: FastText [en_US]
dc.subject: Word2Vec [en_US]
dc.subject: Machine Translation [en_US]
dc.subject: Batak-English [en_US]
dc.subject: SDGs [en_US]
dc.title: Peningkatan Akurasi Long Short-Term Memori (LSTM) Menggunakan Word2Vec dan Fastext untuk Machine Translation Bahasa Batak-Inggris [en_US]
dc.title.alternative: Improving The Accuracy of Long Short-Term Memory (LSTM) Using Word2Vec and Fastext for Batak-English Machine Translation [en_US]
dc.type: Thesis [en_US]
dc.identifier.nim: NIM207038026
dc.identifier.nidn: NIDN0026106209
dc.identifier.nidn: NIDN8859540017
dc.identifier.kodeprodi: KODEPRODI55101#Teknik Informatika
dc.description.pages: 70 Pages [en_US]
dc.description.type: Tesis Magister (Master's Thesis) [en_US]
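
The abstract above describes a three-stage pipeline: pretrained word embeddings (FastText or Word2Vec), an LSTM encoder-decoder translating Batak to English, and BLEU scoring of the predicted translations. The record does not name an implementation framework, so the following is a minimal sketch only, assuming gensim for the embeddings, Keras (TensorFlow) for the encoder-decoder, and NLTK for BLEU; the two sentence pairs, vector size, layer width, and epoch counts are toy placeholders, not values or data from the study.

```python
# Minimal sketch (not the thesis code): FastText/Word2Vec embeddings feeding an LSTM
# encoder-decoder, scored with BLEU. Corpus, dimensions and hyperparameters are toy
# placeholders; the real study used 28,420 Batak-English sentence pairs.
import numpy as np
from gensim.models import FastText, Word2Vec
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from tensorflow.keras.initializers import Constant
from tensorflow.keras.layers import Dense, Embedding, Input, LSTM
from tensorflow.keras.models import Model

# Toy stand-ins for the Batak-English corpus (placeholder glosses, not thesis data).
pairs = [("horas ma di hamu", "greetings to you"),
         ("mauliate godang", "thank you very much")]
src_tok = [s.split() for s, _ in pairs]
tgt_tok = [("<s> " + t + " </s>").split() for _, t in pairs]

# 1) Train embeddings on the source side; Word2Vec would be trained the same way.
emb_dim = 64
ft = FastText(sentences=src_tok, vector_size=emb_dim, window=3, min_count=1, epochs=20)
# w2v = Word2Vec(sentences=src_tok, vector_size=emb_dim, window=3, min_count=1, epochs=20)

# 2) Vocabularies plus an encoder embedding matrix initialised from the FastText vectors.
src_vocab = {w: i + 1 for i, w in enumerate(sorted({w for s in src_tok for w in s}))}
tgt_vocab = {w: i + 1 for i, w in enumerate(sorted({w for t in tgt_tok for w in t}))}
emb_matrix = np.zeros((len(src_vocab) + 1, emb_dim))
for w, i in src_vocab.items():
    emb_matrix[i] = ft.wv[w]  # FastText can also vectorise unseen words via subwords

src_len = max(len(s) for s in src_tok)
tgt_len = max(len(t) for t in tgt_tok)
enc_in = np.zeros((len(pairs), src_len), dtype="int32")
dec_in = np.zeros((len(pairs), tgt_len), dtype="int32")
dec_out = np.zeros((len(pairs), tgt_len, len(tgt_vocab) + 1), dtype="float32")
for n, (s, t) in enumerate(zip(src_tok, tgt_tok)):
    for j, w in enumerate(s):
        enc_in[n, j] = src_vocab[w]
    for j, w in enumerate(t):
        dec_in[n, j] = tgt_vocab[w]
        if j > 0:                              # decoder learns to predict the next token
            dec_out[n, j - 1, tgt_vocab[w]] = 1.0

# 3) LSTM encoder-decoder with teacher forcing; encoder embeddings frozen to FastText.
units = 128
e_in = Input(shape=(src_len,))
e_emb = Embedding(len(src_vocab) + 1, emb_dim, embeddings_initializer=Constant(emb_matrix),
                  trainable=False, mask_zero=True)(e_in)
_, h, c = LSTM(units, return_state=True)(e_emb)
d_in = Input(shape=(tgt_len,))
d_emb = Embedding(len(tgt_vocab) + 1, emb_dim, mask_zero=True)(d_in)
d_seq = LSTM(units, return_sequences=True)(d_emb, initial_state=[h, c])
probs = Dense(len(tgt_vocab) + 1, activation="softmax")(d_seq)
model = Model([e_in, d_in], probs)
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit([enc_in, dec_in], dec_out, epochs=10, verbose=0)

# 4) BLEU: n-gram overlap between a predicted translation and its reference.
reference = [["greetings", "to", "you"]]
candidate = ["greetings", "to", "you"]
print("BLEU:", sentence_bleu(reference, candidate,
                             smoothing_function=SmoothingFunction().method1))
```

Swapping FastText for Word2Vec in this sketch only changes step 1; FastText's subword n-grams are what allow it to produce vectors for out-of-vocabulary Batak words, which the abstract credits for its higher average BLEU score (0.9516 versus 0.9389).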

