Spelling Correction untuk Non-Word Errors Bahasa Batak Toba menggunakan Metode Peter Norvig
Spelling Correction for Non-Word Errors in Batak Toba Language Using the Peter Norvig Method
Abstract
Digital communication in the Batak Toba language is often hampered by typing and spelling mistakes. One of the most common problems is the non-word error, in which a word is misspelled so that it no longer matches any valid dictionary entry. For example, the correct word “pailahon” (meaning “embarrassing”) might be mistyped as “paolahn.” Such errors reduce readability, lower the quality of communication, and in the long term may hinder efforts to preserve the Batak Toba language in digital form. To address this problem, this study develops a spelling correction system for non-word errors in Batak Toba using the Peter Norvig method. The method combines a probability-based approach with edit distance calculations to find and suggest the most likely correct word. The dataset consists of a Batak Toba word corpus of 5,000 words and a test set of 498 words. In the evaluation, the system achieved an accuracy of 90.76%, correctly fixing 452 of the 498 misspelled words, with a precision of 90.16%, a recall of 90.76%, and an F1-score of 90.42%. These results indicate that the Peter Norvig method works well for spelling correction in local languages such as Batak Toba, which have limited digital resources.
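The combination of word probability and edit distance described in the abstract can be illustrated with a minimal sketch in the style of Norvig's well-known corrector. This is not the thesis's implementation: the three-word frequency table, the plain a-z alphabet, and all function names are assumptions made for illustration, and the study's 5,000-word corpus is not reproduced here.

```python
from collections import Counter

# Toy frequency table (assumed for illustration; not the study's corpus).
WORD_COUNTS = Counter({"pailahon": 3, "horas": 10, "mauliate": 7})
# Assumed alphabet: plain lowercase Latin letters.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def probability(word):
    """Relative frequency of `word` in the toy corpus (0 if unseen)."""
    return WORD_COUNTS[word] / sum(WORD_COUNTS.values())


def edits1(word):
    """All strings exactly one delete, transpose, replace, or insert away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def edits2(word):
    """All strings reachable by applying two single edits in sequence."""
    return {e2 for e1 in edits1(word) for e2 in edits1(e1)}


def known(words):
    """Subset of `words` that appears in the corpus."""
    return {w for w in words if w in WORD_COUNTS}


def correct(word):
    """Most probable known word at the smallest edit distance (0, 1, then 2)."""
    candidates = (known([word]) or known(edits1(word))
                  or known(edits2(word)) or {word})
    return max(candidates, key=probability)
```

With this toy table, `correct("paolahn")` returns `"pailahon"`, since the misspelling is two edits away from it (replace "o" with "i", then insert "o" before the final "n"). A word with no known candidate within two edits is returned unchanged.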
Collections
- Undergraduate Theses