Peningkatan Akurasi K-Means menggunakan Kombinasi Rapid Centroid Estimation dan Canberra Distance
Improving K-Means Accuracy Using a Combination of Rapid Centroid Estimation and Canberra Distance

Date
2025
Author
Sentia, Ayuni
Advisor(s)
Lydia, Maya Silvi
Sawaluddin
Abstract
K-Means is a widely used clustering method due to its simplicity; however, it has
limitations related to the random initialization of centroids and its reliance on the
Euclidean Distance metric. This study aims to improve the accuracy of K-Means by
integrating the Rapid Centroid Estimation (RCE) method for initial centroid
selection and employing the Canberra Distance as the distance metric. The number
of clusters is determined using the Elbow method, and clustering performance is
evaluated using the Silhouette Coefficient. Experiments were conducted on two
datasets: Wholesale Customers and New Student Enrollment Data. The results
show that the combination of RCE and Canberra Distance in K-Means yields
significantly improved clustering accuracy. At the optimal number of clusters (k =
3), accuracy increased from 33.30% to 58.40% on the Wholesale Customers
dataset, and from 25.51% to 34.61% on the New Student dataset after applying
min-max normalization. The proposed approach therefore produces higher-quality
clusters than standard K-Means without RCE and Canberra Distance.
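
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names are hypothetical, RCE itself is not implemented (the initial centroids are simply passed in, where RCE would supply them), and updating centroids by the cluster mean is an assumption. Only min-max normalization and the Canberra Distance, d(p, q) = sum_i |p_i - q_i| / (|p_i| + |q_i|), follow their standard definitions.

```python
def min_max_normalize(rows):
    """Rescale each column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def canberra(p, q):
    """Canberra distance; coordinates where both values are 0 are skipped."""
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total


def kmeans_canberra(rows, centroids, max_iter=100):
    """K-Means-style loop: Canberra assignment, mean update (assumed).

    `centroids` are the initial centroids; in the thesis these would
    come from RCE, which this sketch does not implement.
    """
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assign every row to its nearest centroid under Canberra distance.
        new_labels = [
            min(range(len(centroids)), key=lambda j: canberra(row, centroids[j]))
            for row in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for j in range(len(centroids)):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


rows = [[1, 1], [1, 2], [10, 10], [10, 12]]
labels, centroids = kmeans_canberra(rows, [[1, 1], [10, 10]])
```

One property worth keeping in mind when combining the two steps: a coordinate that is exactly 0 after min-max normalization contributes the maximum value of 1 to the Canberra Distance against any nonzero value, so points near the lower bound of a feature can be assigned differently than they would be under Euclidean Distance.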
Collections
- Master Theses [623]