Show simple item record

dc.contributor.advisor: Efendi, Syahril
dc.contributor.advisor: Mawengkang, Herman
dc.contributor.advisor: Sawaluddin
dc.contributor.author: Triandi, Budi
dc.date.accessioned: 2024-02-16T02:49:46Z
dc.date.available: 2024-02-16T02:49:46Z
dc.date.issued: 2023
dc.identifier.uri: https://repositori.usu.ac.id/handle/123456789/91300
dc.description.abstract: Emotion is a condition that arises in the human subconscious and influences a person's behavior. One way to identify a person's emotions is through changes in the voice-feature data contained in their speech. Voice signals carry complex feature data in very large volumes and contain uncertain parameters, which makes potential emotions difficult to predict; the difficulty increases when the voice signal is limited to a single form of emotion. Further research is therefore needed into non-parametric predictive analysis to optimize the influential features and obtain optimal patterns that capture changes in a person's emotional state. Collecting multiple data sets requires analytical solutions appropriate to the phenomena involved, under the conditions in which the data are measured or for which numerical hypothetical results are available. Many methods have been applied to this problem, such as optimization to obtain optimal estimates from continuous data for prediction. Deep learning approaches can be used in Speech Emotion Recognition (SER) research, but an alternative is needed to overcome problems that frequently occur, such as overfitting caused by long trial-and-error learning. This dissertation focuses on optimization in SER by developing a learning technique based on nonparametric regression with conic multivariate adaptive regression splines (CMARS), which describes the relationship between the dependent and independent variables and interprets the relationships among the parameters mathematically. The research uses the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, with 600 audio files in *.WAV format as training and test data.
Voice features are extracted with the mel-frequency cepstral coefficient (MFCC) technique to obtain the mel-cepstral coefficients. The tests carried out in this dissertation research yielded a generalized cross-validation (GCV) value of 0.0130, an estimated root mean squared error (RMSE) of 0.0062, and a prediction fit of R2_Score (RSq) = 0.9720, or 97.20%, indicating that the proposed model is the best nonparametric CMARS regression model for the speech emotion recognition (SER) problem.
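The three fit measures quoted in the abstract (GCV, RMSE, and RSq) can be sketched in plain Python. This is a minimal illustration of the metric definitions only; the arrays and the `effective_params` penalty term below are toy values for demonstration, not the dissertation's data or its CMARS model.

```python
import math

def rmse(y_true, y_pred):
    # Root mean squared error of predictions against targets.
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2_score(y_true, y_pred):
    # Coefficient of determination (RSq): 1 - residual SS / total SS.
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def gcv(y_true, y_pred, effective_params):
    # Generalized cross-validation: MSE inflated by a model-complexity
    # penalty, where effective_params approximates the model's degrees
    # of freedom (in MARS/CMARS, a function of the basis-function count).
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return mse / (1.0 - effective_params / n) ** 2

# Toy illustration:
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.1, 0.9, 2.1, 2.9]
print(rmse(y_true, y_pred))      # ≈ 0.1
print(r2_score(y_true, y_pred))  # 0.992
print(gcv(y_true, y_pred, 1))    # ≈ 0.0178
```

Lower GCV and RMSE and an RSq closer to 1 indicate a better fit, which is the sense in which the abstract's values (GCV 0.0130, RMSE 0.0062, RSq 0.9720) are reported.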
dc.language.iso: id
dc.publisher: Universitas Sumatera Utara
dc.subject: Optimization
dc.subject: Speech Emotion Recognition (SER)
dc.subject: Mel-cepstral coefficient
dc.subject: Conic Multivariate Adaptive Regression Splines
dc.subject: RAVDESS dataset
dc.subject: SDGs
dc.title: Optimasi Kontinu untuk Speech Emotion Recognition (SER) Berdasarkan Deep Learning dan CMARS
dc.type: Thesis
dc.identifier.nim: NIM198123012
dc.identifier.nidn: NIDN0010116706
dc.identifier.nidn: NIDN8859540017
dc.identifier.nidn: NIDN0031125982
dc.identifier.kodeprodi: KODEPRODI55001#Ilmu Komputer
dc.description.pages: 171 pages
dc.description.type: Doctoral Dissertation

