Show simple item record

dc.contributor.advisor: Efendi, Syahril
dc.contributor.advisor: Mawengkang, Herman
dc.contributor.advisor: Sawaluddin
dc.contributor.author: Triandi, Budi
dc.date.accessioned: 2024-02-16T02:49:46Z
dc.date.available: 2024-02-16T02:49:46Z
dc.date.issued: 2023
dc.identifier.uri: https://repositori.usu.ac.id/handle/123456789/91300
dc.description.abstract: Emotion is a condition that arises in the human subconscious and influences a person's behavior. One way to identify a person's emotions is through changes in the voice-feature data contained in their speech. Voice signals carry complex feature data in very large volumes and contain uncertain parameters, which makes potential emotions difficult to predict; the difficulty increases when the voice signal is limited to a single form of emotion. Further research is therefore needed into non-parametric predictive analysis to optimize the influential features and obtain optimal patterns that capture changes in a person's emotional state. Collecting multiple data sets requires analytical solutions appropriate to the phenomena involved, under the conditions in which the data are measured or for which numerical hypothetical results are available. Many methods have been applied to this problem, such as optimization to obtain optimal estimates from continuous data for prediction. Deep learning approaches can be used in Speech Emotion Recognition (SER) research, but an alternative is needed to overcome problems that frequently occur, such as overfitting caused by long trial-and-error learning. This dissertation focuses on optimization in SER by developing a learning technique based on nonparametric regression with conic multivariate adaptive regression splines (CMARS), which describes the relationship between the dependent and independent variables and interprets the relationships among the parameters mathematically. The research uses the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, with 600 audio files in *.WAV format as training and test data.
Voice features are extracted with the mel-frequency cepstral coefficient (MFCC) technique to obtain the mel-cepstral coefficients. The tests carried out in this dissertation research yielded a generalized cross-validation (GCV) value of 0.0130, an estimated root mean squared error (RMSE) of 0.0062, and a prediction fit of R2_Score (RSq) = 0.9720, or 97.20%, indicating that the proposed model is the best nonparametric CMARS regression model for the speech emotion recognition (SER) problem.
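The three fit measures quoted in the abstract (GCV, RMSE, and RSq) can be sketched in plain Python. This is a minimal illustration of the metric definitions only; the arrays and the `effective_params` penalty term below are toy values for demonstration, not the dissertation's data or its CMARS model.

```python
import math

def rmse(y_true, y_pred):
    # Root mean squared error of predictions against targets.
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2_score(y_true, y_pred):
    # Coefficient of determination (RSq): 1 - residual SS / total SS.
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def gcv(y_true, y_pred, effective_params):
    # Generalized cross-validation: MSE inflated by a model-complexity
    # penalty, where effective_params approximates the model's degrees
    # of freedom (in MARS/CMARS, a function of the basis-function count).
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return mse / (1.0 - effective_params / n) ** 2

# Toy illustration:
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.1, 0.9, 2.1, 2.9]
print(rmse(y_true, y_pred))      # ≈ 0.1
print(r2_score(y_true, y_pred))  # 0.992
print(gcv(y_true, y_pred, 1))    # ≈ 0.0178
```

Lower GCV and RMSE and an RSq closer to 1 indicate a better fit, which is the sense in which the abstract's values (GCV 0.0130, RMSE 0.0062, RSq 0.9720) are reported.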
dc.language.iso: id
dc.publisher: Universitas Sumatera Utara
dc.subject: Optimization
dc.subject: Speech Emotion Recognition (SER)
dc.subject: Mel-cepstral coefficient
dc.subject: Conic Multivariate Adaptive Regression Splines
dc.subject: RAVDESS dataset
dc.subject: SDGs
dc.title: Optimasi Kontinu untuk Speech Emotion Recognition (SER) Berdasarkan Deep Learning dan CMARS
dc.type: Thesis
dc.identifier.nim: NIM198123012
dc.identifier.nidn: NIDN0010116706
dc.identifier.nidn: NIDN8859540017
dc.identifier.nidn: NIDN0031125982
dc.identifier.kodeprodi: KODEPRODI55001#Ilmu Komputer
dc.description.pages: 171 pages
dc.description.type: Doctoral Dissertation

