dc.description.abstract | The decision tree is a commonly used classification algorithm that has
advantages over other algorithms. The Decision Tree C5.0 algorithm nevertheless has
several drawbacks: C5.0 and other decision tree methods are often biased towards
splits on features that have many levels; the model may over-fit or under-fit; small
changes to the training data can result in large changes to the decision logic; and
because C5.0 relies on axis-parallel splits, it can have difficulty modeling some
relationships. Data imbalance also causes a low accuracy rate in the C5.0 algorithm.
Boosting is an ensemble meta-algorithm that primarily reduces bias, and also
variance. Each iteration of the boosting process changes the distribution of the
training data by increasing the weight assigned to incorrectly classified examples
and decreasing the weight assigned to correctly classified examples. The purpose
of this research is to improve the performance of the Decision Tree C5.0
classification method using adaptive boosting (Adaboost) to predict hepatitis
disease, with performance evaluated using a confusion matrix. In confusion-matrix
tests on the Hepatitis dataset, Decision Tree C5.0 achieved an accuracy of 77.41%
with a classification error rate of 22.58%, whereas Decision Tree C5.0 with
Adaboost achieved a higher accuracy of 83.87% with a misclassification rate of
16.12%. On the Heart Disease dataset, C5.0 achieved an accuracy of 75.92% with a
classification error of 29.54%; with Adaboost, accuracy increased to 77.77% with a
classification error of 24.07%. On the Lung Cancer dataset, C5.0 achieved an
accuracy of 82.25% with a classification error of 17.74%; with C5.0 Adaboost,
accuracy increased to 85.48% and the classification error decreased to 14.51%. This
difference is attributed to the Adaboost algorithm, which is able to turn a weak
classifier into a strong classifier by increasing the weight of misclassified
observations, and which thereby also reduces the classifier error rate. | en_US |
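
The reweighting step described in the abstract can be illustrated with a minimal sketch. It assumes the textbook discrete AdaBoost update for binary classification, which is not necessarily the exact boosting variant used with C5.0 in this research; the function name `adaboost_reweight` and the four-example toy data are hypothetical and chosen only for illustration.

```python
import math


def adaboost_reweight(weights, correct):
    """One round of the textbook discrete AdaBoost weight update.

    weights: current example weights, assumed to sum to 1.
    correct: booleans, True where the weak classifier was right.
    Returns the normalized new weights and the classifier weight alpha.
    """
    # Weighted error of the weak classifier on the current distribution.
    err = sum(w for w, c in zip(weights, correct) if not c)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")

    # Classifier weight: larger when the weak classifier is more accurate.
    alpha = 0.5 * math.log((1.0 - err) / err)

    # Increase weights of misclassified examples, decrease the others.
    updated = [
        w * math.exp(-alpha if c else alpha)
        for w, c in zip(weights, correct)
    ]

    # Renormalize so the weights again form a distribution.
    total = sum(updated)
    return [w / total for w in updated], alpha


# Toy data (hypothetical): four equally weighted examples, one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
new_weights, alpha = adaboost_reweight(weights, correct)
```

Under this update the misclassified example carries more weight in the next iteration, so the next weak classifier is pushed to concentrate on it, while the correctly classified examples are down-weighted.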