Penguatan Kualitas Data Klaster dengan Pemanfaatan Cure-Perseptual pada Titik Representatif dan Centroid Cure
Enhancing Cluster Data Quality through the Implementation of Cure-Perceptual on Representative Points and Cure Centroid

Date
2025
Author
Ginting, Dewi Sartika Br
Advisor(s)
Efendi, Syahril
Amalia
Sihombing, Poltak
Abstract
The CURE (Clustering Using Representatives) algorithm is well known for its robust
clustering capabilities, particularly in handling non-linear data structures and the
presence of outliers. However, the conventional CURE approach does not consider the
perceptual significance of individual variables in the clustering process. This study
proposes an innovative model named CURE-Perceptual, a variant of the original CURE
algorithm that introduces perceptual weighting in the computation of centroid and
representative points. The weights are assigned based on the relative significance of each
dimension in influencing the clustering outcome and are applied starting from the second
iteration of the cluster merging process. This enhancement enables the clustering process
to become more adaptive to multidimensional data contexts and produces more
homogeneous clusters. The model was evaluated using a case study on child stunting data,
comprising 6,500 entries and seven primary attributes. Results demonstrate that CURE-Perceptual
significantly reduces the number of identified outliers from 102 with
conventional CURE to 51, reduces processing time from 14,849 ms to 4,830 ms and
4,782 ms, and improves cluster homogeneity, as reflected by an increase in Silhouette
Score from 0.67 to 0.94. Additionally, the model yields more structured and representative
cluster distributions. Two different weighting schemes were tested to assess the influence
of perceptual emphasis on the final clustering outcome. The CURE-Perceptual model
offers a flexible and targeted approach to clustering analysis, especially for complex
datasets with numerous variables. This innovation contributes to the development of more
responsive representative-based clustering methods aligned with modern data modeling
needs.
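
The abstract does not specify how the perceptual weights enter the computation, so the following is only an illustrative sketch, not the author's implementation. It assumes the weights scale each dimension inside a weighted Euclidean distance, that representative points are chosen as well-scattered points under that distance, and that they are then shrunk toward the cluster centroid as in standard CURE. The function names, the `shrink` factor, and the sample data are hypothetical, and the merging loop (including the rule of applying weights from the second iteration) is not covered.

```python
import math


def weighted_distance(p, q, weights):
    """Euclidean distance with one (assumed) perceptual weight per dimension."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(p, q, weights)))


def centroid(points):
    """Plain arithmetic mean of the cluster's points, per dimension."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def representatives(points, weights, num_reps=2, shrink=0.5):
    """Pick well-scattered points under the weighted distance, then shrink
    each one toward the centroid by the fraction `shrink` (CURE-style)."""
    c = centroid(points)
    chosen = []                      # indices of selected points
    remaining = list(range(len(points)))
    while remaining and len(chosen) < num_reps:
        if not chosen:
            # First representative: farthest point from the centroid.
            best = max(remaining,
                       key=lambda i: weighted_distance(points[i], c, weights))
        else:
            # Next representative: farthest from those already chosen.
            best = max(remaining,
                       key=lambda i: min(weighted_distance(points[i], points[j], weights)
                                         for j in chosen))
        chosen.append(best)
        remaining.remove(best)
    return [[x + shrink * (ci - x) for x, ci in zip(points[i], c)] for i in chosen]


# Hypothetical two-attribute cluster; the two weightings emphasize different attributes.
cluster = [[0, 0], [2, 0], [0, 10], [2, 10]]
reps_first_attr = representatives(cluster, weights=[1, 0])
reps_second_attr = representatives(cluster, weights=[0, 1])
```

In this toy cluster, changing the weight vector changes which points are selected as representatives, which is the kind of sensitivity to the weighting scheme that the abstract reports testing.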