Classification Performance of KNN with Gower Dissimilarity and Logistic Regression on Heterogeneous Datasets
Abstract
The performance of K-Nearest Neighbor (KNN) classification depends heavily on how distance is determined. Distance calculations for classification are usually carried out on numeric data, so Euclidean distance runs into problems when the dataset is heterogeneous. To overcome this problem, this research contributes a distance calculation for classification that covers nominal, ordinal, binary, and numeric attribute types. Distances are computed with the Gower dissimilarity measure. Experiments on three two-class datasets show that the method achieves 71% accuracy on the Bank dataset, 81% on the Churn Modeling dataset, and 84% on the House Prices dataset. These results show that Gower dissimilarity can solve the problem of calculating classification distances on mixed-type data, but it is not as stable as Logistic Regression, which has been tested for classifying heterogeneous datasets.
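As an illustration of the idea described in the abstract, the following is a minimal sketch of KNN with a Gower-style dissimilarity over mixed attribute types. It is not the thesis's implementation: the function names `gower_distance` and `knn_predict`, the toy records, and the choice to treat ordinal attributes as ranks scaled by their observed range are all assumptions made for this example.

```python
from collections import Counter


def gower_distance(a, b, kinds, ranges):
    """Gower-style dissimilarity between two records.

    kinds[j] is "num" (numeric, or ordinal encoded as integer ranks;
    an assumption of this sketch) or "cat" (nominal or binary).
    ranges[j] is the observed range of feature j, used only for "num".
    """
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "num":
            # Range-normalised absolute difference, in [0, 1] for in-range values.
            total += abs(x - y) / rng if rng else 0.0
        else:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if x == y else 1.0
    return total / len(kinds)


def knn_predict(train_X, train_y, query, kinds, k=3):
    """Majority vote among the k training records closest to `query`."""
    ranges = [
        max(r[j] for r in train_X) - min(r[j] for r in train_X)
        if kind == "num" else None
        for j, kind in enumerate(kinds)
    ]
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: gower_distance(train_X[i], query, kinds, ranges),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


# Hypothetical toy data: (age, job, education rank, has_loan).
kinds = ["num", "cat", "num", "cat"]
train_X = [
    (25, "student", 1, False),
    (30, "student", 1, False),
    (45, "manager", 3, True),
    (50, "manager", 2, True),
    (35, "clerk", 2, False),
]
train_y = ["no", "no", "yes", "yes", "no"]

prediction = knn_predict(train_X, train_y, (48, "manager", 3, True), kinds, k=3)
```

Because every per-feature term lies between 0 and 1, numeric and categorical attributes contribute on a comparable scale, which is what lets KNN operate on heterogeneous records without first forcing everything into numeric form.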