Focused Crawling for Information Gathering Using Hidden Markov Model

Author: Tsung-Kun Shih (施宗昆)

Publication Date: 2008-07

Updated: March 26, 2025

Abstract

Information search is a key activity for many users on the Web. Although search engines are useful and powerful nowadays, they still have many drawbacks. Moreover, many information needs are hard to express using keyword-based queries. In this paper, we apply a method that addresses composite information needs by building a Hidden Markov Model (HMM) to predict the most likely path to the target information. We use the concept of focused crawling to traverse a Web site in search of specific information. Experiments show that the approach performs well for finding admission information and accepted papers.
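To illustrate what predicting the most likely path with an HMM can look like, the following is a minimal sketch of Viterbi decoding over a toy model. It is not taken from the thesis: the page-type states (`home`, `hub`, `target`), the link-text observations (`generic`, `admissions`, `details`), and all probabilities are hypothetical values chosen only for illustration, and the abstract does not say which decoding procedure the author actually uses.

```python
from typing import Dict, List, Tuple

Prob = Dict[str, float]
ProbTable = Dict[str, Dict[str, float]]


def viterbi(observations: List[str], states: List[str], start_p: Prob,
            trans_p: ProbTable, emit_p: ProbTable) -> Tuple[List[str], float]:
    """Return the most likely hidden-state path and its probability."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None)
             for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability.
            prob, arg = max((prev[r][0] * trans_p[r][s], r) for r in states)
            layer[s] = (prob * emit_p[s][obs], arg)
        best.append(layer)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Hypothetical toy model: hidden states are page types, observations are
# coarse categories of the link text followed at each step.
states = ["home", "hub", "target"]
start_p = {"home": 0.8, "hub": 0.15, "target": 0.05}
trans_p = {
    "home": {"home": 0.2, "hub": 0.7, "target": 0.1},
    "hub": {"home": 0.1, "hub": 0.3, "target": 0.6},
    "target": {"home": 0.1, "hub": 0.2, "target": 0.7},
}
emit_p = {
    "home": {"generic": 0.7, "admissions": 0.2, "details": 0.1},
    "hub": {"generic": 0.3, "admissions": 0.6, "details": 0.1},
    "target": {"generic": 0.1, "admissions": 0.2, "details": 0.7},
}

path, prob = viterbi(["generic", "admissions", "details"],
                     states, start_p, trans_p, emit_p)
print(path)
```

In a focused crawler, a decoded path of this kind could be used to rank which outgoing links to follow next, although how the thesis applies its model to link selection is not described in the abstract.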