dc.description.abstract | In the era of globalization, technology is advancing rapidly, making communication easier through social media. One of the most widely used platforms is X, formerly known as Twitter. X has had a major impact on industry, business, and politics, with 19.5 million users in Indonesia out of a global total of 500 million. However, its popularity also attracts spammers who engage in activities such as political campaigning, the dissemination of misleading information, and irrelevant promotions. Spam, defined as unwanted mass messages, disrupts user privacy and convenience. Research is therefore needed to distinguish spam from non-spam posts in order to improve users' convenience and security. This study aims to detect Indonesian-language spam on X, based on posts and reposts, using TF-IDF and the Random Forest Classifier. The study uses 2,800 posts and reposts collected from X user accounts. The preprocessing stages were removing unwanted variables and emojis, converting text to lowercase, removing punctuation and symbols, normalization, stop-word removal, and tokenization. TF-IDF was used to convert the words in the data into vectors, which were then classified using the Random Forest Classifier. The model was evaluated using a confusion matrix, yielding an accuracy of 0.97. Based on this evaluation, it can be concluded that the algorithm used in this study detects spam posts and reposts effectively. | en_US |