Klasifikasi Kelas pada Data Tidak Seimbang dalam Deteksi Mikrokalsifikasi Menggunakan SMOTE-ENN dan XGBoost dengan Optimasi Bayesian
Class Classification on Imbalanced Data in Microcalcification Detection Using SMOTE-ENN and XGBoost with Bayesian Optimization
Abstract
Imbalanced data is a common problem in classification. Classification
on imbalanced data can be handled with two approaches: the data level and
the algorithm level. The data-level approach balances the class
distribution, while the algorithm-level approach improves the classification
algorithm itself.
This research aims to handle classification on imbalanced data and to determine
the effectiveness of hyperparameter optimization. The classification analysis
is carried out on microcalcification detection data, which has a class
imbalance of 98%. The methods used are SMOTE-ENN at the data level, XGBoost at
the algorithm level, and Bayesian optimization for hyperparameter
optimization.
Classification performance is evaluated by comparing XGBoost with default
hyperparameter values against XGBoost tuned with Bayesian optimization. The
results show that SMOTE-ENN is able to balance the class distribution of highly
imbalanced data. XGBoost produces a classification model with high accuracy
but relatively low precision and F-measure. Bayesian optimization
significantly improved classification performance: it increased accuracy,
specificity, precision, and F-measure, but decreased recall. Based on this
analysis, XGBoost with Bayesian optimization performs better than XGBoost with
default values.
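The evaluation measures named in the abstract can all be derived from a confusion matrix. The following is a minimal sketch of that arithmetic. The function name `classification_metrics` and every count in it are hypothetical and chosen only for illustration; they are not results from this thesis. The counts assume a data set in which roughly 98% of samples belong to the negative class, and they are picked to show how accuracy can stay high while precision and F-measure are low, and how a model can gain precision while losing some recall.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return common binary-classification metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    return {
        "accuracy": (tp + tn) / total,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts (NOT thesis results): 10,000 samples, 200 of them positive.
default = classification_metrics(tp=180, fn=20, fp=300, tn=9500)

# Hypothetical counts for a tuned model: fewer false positives, more false negatives.
tuned = classification_metrics(tp=170, fn=30, fp=100, tn=9700)

for name, scores in (("default", default), ("tuned", tuned)):
    print(name, {key: round(value, 3) for key, value in scores.items()})
```

With counts like these, a large number of true negatives dominates accuracy and specificity, so precision and F-measure are the more informative measures for the rare positive class.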