Pemilihan Fitur Menggunakan Metode Ensemble Lasso Regression, Random Forest dan Recursive Features Elimination Dalam Klasifikasi Kanker Payudara
Feature Selection Using Ensemble Lasso Regression, Random Forest, and Recursive Feature Elimination Methods in Breast Cancer Classification
Abstract
Healthcare datasets, especially those used in cancer diagnosis, often present
challenges such as high dimensionality, redundancy, and irrelevant features,
which can reduce the performance and reliability of machine learning
models. This study proposes a robust ensemble feature selection method that
addresses these challenges by combining Lasso Regression, Random Forest,
and Recursive Feature Elimination (RFE). By utilizing the complementary
strengths of these algorithms, the ensemble approach aims to improve the
stability of feature selection and enhance classification accuracy. In addition,
Shannon entropy is used to evaluate data complexity and guide the feature
selection process. The proposed method is applied to the Breast Cancer
(Diagnosis) dataset, and its performance is evaluated using accuracy,
precision, recall, and F1 score. The experimental results show that
the ensemble method outperforms individual feature selection techniques,
achieving higher classification accuracy and reliability in handling complex
and imbalanced datasets. This research advances machine learning-based
diagnostic tools by providing a reliable framework for analyzing
high-dimensional medical data. These results highlight the potential of
ensemble feature selection to improve interpretability, reduce computational
cost, and increase the predictive accuracy of breast cancer diagnosis.
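
The abstract does not state how the three selectors are combined or how Shannon entropy enters the process, so the following is only a minimal sketch under stated assumptions: the outputs of Lasso Regression, Random Forest, and RFE are merged by majority vote (a feature is kept if at least two selectors chose it), and entropy is computed in bits over discrete values. The function names, the `min_votes` parameter, and the example feature lists are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import log2


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def majority_vote(selections, min_votes=2):
    """Keep features chosen by at least `min_votes` of the selectors.

    Assumption: majority voting is one common way to build an ensemble;
    the thesis may combine the selectors differently.
    """
    votes = Counter(f for selected in selections for f in set(selected))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical outputs of the three selectors (illustrative names only).
lasso_selected = ["radius_mean", "texture_mean", "concavity_mean"]
forest_selected = ["radius_mean", "area_mean", "concavity_mean"]
rfe_selected = ["radius_mean", "texture_mean", "smoothness_mean"]

ensemble_selected = majority_vote([lasso_selected, forest_selected, rfe_selected])

# Entropy of a perfectly balanced two-class label column.
balanced_entropy = shannon_entropy(["M", "B", "M", "B"])
```

Raising `min_votes` to 3 would keep only the features on which all three selectors agree, trading coverage for stability.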