dc.description.abstract | This study addresses class imbalance in datasets, where the minority class is significantly smaller than the majority class. This imbalance biases classification algorithms toward the majority class, leading to poor recognition of the minority class. The ID3 Shannon algorithm has limitations in handling imbalanced data, making it the primary focus of this study. The study evaluates the performance of the ID3 Shannon and Modified ID3 algorithms: ID3 Shannon uses entropy for attribute selection in decision trees, whereas Modified ID3 is designed to overcome the limitations of ID3 Shannon, particularly in classifying the minority class. Experimental results on an 80:20 data split at a tree depth of 6 indicate that ID3 Shannon achieves better precision for referral class 1 (0.69) than for referral class 0 (0.23). Its recall is also far higher for referral 1 (0.97) than for referral 0 (0.02). Z-score normalization does not significantly affect ID3 Shannon’s performance. Modified ID3 exhibits substantial weaknesses in classifying referral 0, with both precision and recall at 0.00 before and after normalization. For referral 1, Modified ID3 achieves a precision of 0.69, a recall of 1.00, and an F1-score of 0.82. Under the 70:30 data split, ID3 Shannon performs well on referral 1, while Modified ID3 again fails to classify referral 0 effectively. For referral 1, Modified ID3 achieves a precision of 0.68, a recall of 1.00, and an F1-score of 0.81. Overall, Modified ID3 attains higher accuracy than ID3 Shannon in both data split scenarios at a tree depth of 6: 69.18% for the 80:20 split and 68.44% for the 70:30 split, compared with 67.79% and 67.71%, respectively, for ID3 Shannon. Tree depth evaluation indicates that increasing depth leads to overfitting in ID3 Shannon.
Implementing Minimum Error Pruning (MEP) optimizes tree depth by simplifying the tree structure without compromising accuracy. After applying MEP, ID3 Shannon achieves accuracy comparable to that of Modified ID3, demonstrating that pruning is more effective than normalization at improving performance. ID3 Shannon maintains stable accuracy across different tree depths, suggesting greater adaptability to changes in model complexity than Modified ID3. | en_US |