Penguatan Kualitas Data Klaster dengan Pemanfaatan Cure-Perseptual pada Titik Representatif dan Centroid Cure

Ginting, Dewi Sartika Br

Enhancing Cluster Data Quality through the Implementation of Cure-Perceptual on Representative Points and Cure Centroid

Date

2025

Author

Ginting, Dewi Sartika Br

Advisor(s)

Efendi, Syahril

Amalia

Sihombing, Poltak

Abstract

The CURE (Clustering Using Representatives) algorithm is well known for its robust clustering capabilities, particularly in handling non-linear data structures and outliers. However, conventional CURE does not consider the perceptual significance of individual variables in the clustering process. This study proposes CURE-Perceptual, a variant of the original CURE algorithm that introduces perceptual weighting into the computation of the centroid and the representative points. The weights are assigned according to the relative significance of each dimension in influencing the clustering outcome, and they are applied from the second iteration of the cluster-merging process onward. This enhancement makes the clustering process more adaptive to multidimensional data contexts and produces more homogeneous clusters. The model was evaluated in a case study on child stunting data comprising 6,500 entries and seven primary attributes. The results show that CURE-Perceptual reduces the number of identified outliers from 102 with conventional CURE to 51, cuts processing time from 14,849 ms to 4,830 ms and 4,782 ms, and improves cluster homogeneity, as reflected by an increase in Silhouette Score from 0.67 to 0.94. The model also yields more structured and representative cluster distributions. Two different weighting schemes were tested to assess the influence of perceptual emphasis on the final clustering outcome. The CURE-Perceptual model offers a flexible and targeted approach to clustering analysis, especially for complex datasets with many variables. This work contributes to the development of more responsive representative-based clustering methods aligned with modern data modeling needs.
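The abstract does not give the weighting formulas, so the following is only a minimal illustrative sketch of how per-dimension weights could enter a CURE-style computation. It assumes the weights act through a weighted Euclidean distance; the function names, the `shrink` factor, and the farthest-first selection rule are hypothetical choices for illustration and are not taken from the dissertation. Applying the weights only from the second merge iteration onward is not modelled here.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance: a larger weight makes that dimension count more.

    Assumption: this is one plausible form of 'perceptual weighting',
    not the formula used in the dissertation.
    """
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def centroid(points):
    """Arithmetic mean of the cluster, computed per dimension."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def representatives(points, weights, num_rep=3, shrink=0.5):
    """Pick well-scattered points (farthest-first) and shrink them toward the centroid."""
    center = centroid(points)
    pool = list(dict.fromkeys(points))  # drop duplicate points, keep order
    # Start with the point farthest from the centroid under the weighted metric.
    chosen = [max(pool, key=lambda p: weighted_distance(p, center, weights))]
    while len(chosen) < min(num_rep, len(pool)):
        # Next: the point whose nearest already-chosen representative is farthest away.
        candidate = max(
            (p for p in pool if p not in chosen),
            key=lambda p: min(weighted_distance(p, c, weights) for c in chosen),
        )
        chosen.append(candidate)
    # Move each representative a fraction `shrink` of the way toward the centroid.
    return [
        tuple(x + shrink * (m - x) for x, m in zip(p, center))
        for p in chosen
    ]


def cluster_distance(reps_a, reps_b, weights):
    """Distance between two clusters: closest pair of their representatives."""
    return min(weighted_distance(a, b, weights) for a in reps_a for b in reps_b)


# Hypothetical two-attribute data; points must be tuples.
cluster_a = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0), (4.0, 2.0)]
cluster_b = [(10.0, 0.0), (12.0, 1.0)]
weights = (1.0, 0.25)  # hypothetical: first attribute judged more significant

reps_a = representatives(cluster_a, weights, num_rep=2)
reps_b = representatives(cluster_b, weights, num_rep=2)
gap = cluster_distance(reps_a, reps_b, weights)
```

In a CURE-style merge loop, the two clusters with the smallest `cluster_distance` would be merged first, so changing `weights` changes which dimensions drive the merge order.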

URI

https://repositori.usu.ac.id/handle/123456789/105298

Collections

Doctoral Dissertations