    •   USU-IR Home
    • Faculty of Computer Science and Information Technology
    • Department of Information Technology
    • Undergraduate Theses

    Implementasi Algoritma LSTM pada Speech Recognition dalam Aplikasi Mobile untuk Membantu Tunanetra Membaca Dokumen

    Implementation of LSTM Algorithm in Speech Recognition in Mobile Application to Help Blind People Read Documents

    Date
    2025
    Author
    Siahaan, Gabryelle Ninna Deffanya
    Advisor(s)
    Pulungan, Annisa Fadhillah
    Nurhasanah, Rossy
    Abstract
    Blind and visually impaired individuals face challenges in accessing printed documents because braille formats are scarce and reading aids are expensive. This accessibility gap restricts their independence in obtaining information. This study implements a recurrent neural network algorithm, specifically Long Short-Term Memory (LSTM), for the speech-recognition feature of a mobile application designed to help visually impaired users read documents. The app integrates Optical Character Recognition (OCR), Text-to-Speech (TTS), and voice commands. The research focuses primarily on the voice-command feature, which lets users operate the application independently without relying on others. The command-speech dataset consists of recordings of six commands, “Foto” (Photo), “Info” (Info), “Baca” (Read), “Ulang” (Repeat), “Berhenti” (Stop), and “Kembali” (Back), collected from 53 male and female respondents across various age ranges. The data undergo preprocessing (audio loading, standardization, noise reduction, and band-pass filtering), followed by extraction of Mel-Frequency Cepstral Coefficients (MFCC), label encoding, and padding before being fed into the LSTM model. The best model in this study achieved a testing accuracy of 96.6%. The implementation uses FastAPI to connect the Android mobile application with the speech recognition model. User testing, with five visually impaired participants in each test, yielded a User Satisfaction Score (USS) of 4.4 out of 5.
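The last two steps named in the abstract, label encoding and padding, can be illustrated with a minimal sketch. This is not the thesis's code: the function names `encode_labels` and `pad_sequences`, the class order, zero-padding at the end of each sequence, and the 13 coefficients per frame are all assumptions made for illustration. The MFCC matrices are stood in for by random arrays.

```python
import numpy as np

# The six command words listed in the abstract (the order is an assumption).
COMMANDS = ["Foto", "Info", "Baca", "Ulang", "Berhenti", "Kembali"]


def encode_labels(labels, classes=COMMANDS):
    """Map each command word to its integer index in `classes`."""
    index = {name: i for i, name in enumerate(classes)}
    return np.array([index[name] for name in labels], dtype=np.int64)


def pad_sequences(sequences, max_len=None):
    """Zero-pad (or truncate) (frames, coeffs) matrices to one common length."""
    if max_len is None:
        max_len = max(len(seq) for seq in sequences)
    n_coeffs = sequences[0].shape[1]
    batch = np.zeros((len(sequences), max_len, n_coeffs), dtype=np.float32)
    for i, seq in enumerate(sequences):
        length = min(len(seq), max_len)
        batch[i, :length] = seq[:length]  # remaining frames stay zero
    return batch


if __name__ == "__main__":
    # Hypothetical stand-ins for MFCC matrices of three recordings
    # with different numbers of frames and 13 coefficients each.
    rng = np.random.default_rng(0)
    features = [rng.normal(size=(n, 13)) for n in (40, 55, 32)]
    batch = pad_sequences(features)
    labels = encode_labels(["Baca", "Foto", "Kembali"])
    print(batch.shape, labels)
```

The resulting array has the (samples, time steps, features) layout that recurrent layers such as an LSTM typically take as input, with one integer class label per recording.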
    URI
    https://repositori.usu.ac.id/handle/123456789/105717
    Collections
    • Undergraduate Theses [858]

    Repositori Institusi Universitas Sumatera Utara - 2025

    Universitas Sumatera Utara

    DSpace software copyright © 2002-2016  DuraSpace