Penerapan Web Crawling Menggunakan Algoritma Raita pada Pengumpulan Informasi Penginapan di Berastagi

Lubis, Kania Ulfa

Implementation of Web Crawling Using Raita Algorithm for Lodging Information Gathering in Berastagi

Date

2024

Author

Lubis, Kania Ulfa

Advisor(s)

Amalia

Tarigan, Jos Timanta

Abstract

In the digital era, easy access to lodging information is very important. Automated data collection through web crawling is one alternative, but challenges with accuracy and efficiency remain. This research explores the use of the Raita algorithm to improve the accuracy and efficiency of lodging data collection in Berastagi. The Raita algorithm is a string-matching algorithm that compares the characters of a pattern with the characters of a text window in a fixed order. It first compares the last character of the pattern with the rightmost character of the current window, and continues with the remaining characters only if that comparison matches. By selecting the right seed URL, developing a web crawler, and analyzing the web structure and the characteristics of lodging information, this research found that the Raita algorithm can reduce redundant information and speed up the data collection process. Testing the system with the keyword "hotel" returned 15 hotels, "villa" returned 40 villas, "homestay" returned 11 homestays, and "cottage" returned 3 cottages, while testing without a keyword returned all 106 Berastagi lodgings listed on the tiket.com website. Of the 106 lodging records extracted, 54% were complete and 46% were incomplete, where completeness means that a record includes the name, impression, rating, price, and address of the lodging.
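
For readers unfamiliar with the Raita algorithm, the following is a minimal Python sketch of the textbook version: it checks the last, first, and middle characters of the pattern against the current window before comparing the rest, and shifts the window with a Horspool-style bad-character table. It is not the thesis implementation; the function name `raita_search` and the sample text are assumptions made for illustration.

```python
def raita_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Bad-character shift table (as in Horspool): for each character in
    # pattern[:-1], the distance from its last occurrence to the pattern end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    first, middle, last = pattern[0], pattern[m // 2], pattern[-1]
    matches = []
    j = 0
    while j <= n - m:
        window_last = text[j + m - 1]
        # Raita's comparison order: last, first, middle, then the remainder.
        if (window_last == last
                and text[j] == first
                and text[j + m // 2] == middle
                and text[j + 1:j + m - 1] == pattern[1:m - 1]):
            matches.append(j)
        # Characters absent from the table shift the window by the full length.
        j += shift.get(window_last, m)
    return matches


# Hypothetical sample text standing in for text extracted from a crawled page.
sample = "hotel berastagi, villa and hotel"
print(raita_search(sample, "hotel"))
```

In a crawler, a matcher of this kind could be used to keep only the links or listing entries whose text contains a keyword such as "hotel" or "villa"; how the thesis integrates the algorithm is not stated in the abstract.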

URI

https://repositori.usu.ac.id/handle/123456789/95935

Collections

Undergraduate Theses [1273]