Perbandingan Performa Algoritma K-Means dan DBSCAN dalam Clustering pada Data Berdimensi Tinggi

Wulandari, Rati

dc.contributor.advisor	Yanti, Maulida
dc.contributor.author	Wulandari, Rati
dc.date.accessioned	2025-09-18T07:40:27Z
dc.date.available	2025-09-18T07:40:27Z
dc.date.issued	2025
dc.identifier.uri	https://repositori.usu.ac.id/handle/123456789/108486
dc.description.abstract	Clustering is one of the essential techniques in data analysis for discovering hidden patterns and grouping data based on certain similarities. However, the application of clustering algorithms to high-dimensional data often encounters challenges, particularly due to the presence of noise and irrelevant features. This study aims to analyze and compare the performance of the K-Means and DBSCAN algorithms on high-dimensional data under various conditions. Two datasets were used: the Kaggle diabetes dataset with eight medical variables and the human liver gene expression dataset from ARCHS4 consisting of 35,238 gene features. To reduce dimensional complexity, Principal Component Analysis (PCA) was applied, ensuring that the cumulative variance retained was not less than 80%. Performance evaluation was carried out using two metrics, namely the Davies-Bouldin Index (DBI) and the Silhouette Score (SS). The results indicate that K-Means demonstrates more stable performance in most scenarios, particularly when the data is clean or relatively homogeneous, with consistently positive Silhouette Scores. On the other hand, DBSCAN performs better in scenarios with high levels of noise, as it can explicitly identify outliers, although it tends to classify a large portion of the data as noise under other conditions. Overall, K-Means is more suitable for data with spherical and evenly distributed clusters, whereas DBSCAN is more appropriate for data with varying densities and the presence of noise.	en_US
dc.language.iso	id	en_US
dc.publisher	Universitas Sumatera Utara	en_US
dc.subject	Clustering	en_US
dc.subject	K-Means	en_US
dc.subject	DBSCAN	en_US
dc.subject	Principal Component Analysis	en_US
dc.subject	Davies-Bouldin Index	en_US
dc.subject	Silhouette Score	en_US
dc.subject	High-Dimensional Data	en_US
dc.title	Perbandingan Performa Algoritma K-Means dan DBSCAN dalam Clustering pada Data Berdimensi Tinggi	en_US
dc.title.alternative	Performance Comparison of K-Means and DBSCAN Algorithms in Clustering High-Dimensional Data	en_US
dc.type	Thesis	en_US
dc.identifier.nim	NIM210803045
dc.identifier.nidn	NIDN0024109003
dc.identifier.kodeprodi	KODEPRODI44201#Matematika
dc.description.pages	98 Pages	en_US
dc.description.type	Skripsi Sarjana	en_US
dc.subject.sdgs	SDGs 4. Quality Education	en_US
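
The abstract mentions two mechanics that a short example may make more concrete: keeping enough principal components to retain at least 80% of the cumulative variance, and scoring a clustering with the Silhouette Score. The following is a minimal sketch in plain Python, not the thesis's own implementation. The function names, the toy eigenvalues, and the toy points are all hypothetical, and the rule for singleton clusters (score 0) is a common convention assumed here.

```python
from itertools import accumulate
from math import dist


def components_for_variance(eigenvalues, threshold=0.80):
    """Return the smallest number of leading principal components whose
    cumulative explained-variance ratio reaches `threshold`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    return len(ordered)


def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance.

    For each point, a is the mean distance to the other members of its
    own cluster and b is the smallest mean distance to any other cluster;
    the coefficient is (b - a) / max(a, b).
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # Assumed convention: a singleton cluster contributes 0.
            scores.append(0.0)
            continue
        # The point itself adds distance 0, so divide by len(own) - 1.
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight, well-separated groups.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_score = silhouette_score(toy_points, [0, 0, 1, 1])
bad_score = silhouette_score(toy_points, [0, 1, 0, 1])
n_kept = components_for_variance([4.0, 3.0, 2.0, 1.0])
```

With these toy values, the labeling that follows the two groups scores close to 1, while the labeling that splits each group scores below 0, which matches the abstract's reading of positive Silhouette Scores as a sign of stable clusters.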

Files in this item

Name: Perbandingan Performa Algoritma ...
Size: 712.8Kb
Format: PDF
Description: Cover

Name: Rati Wulandari_Perbandingan ...
Size: 2.140Mb
Format: PDF
Description: Fulltext

This item appears in the following Collection(s)

Undergraduate Theses [1496]
Skripsi Sarjana