Indexing URL Profiles for Efficient Web Page Filtering Services

The rapid accumulation of web pages makes searching the WWW an important problem. At present, users typically rely on search engines to find what they want, but most search engines return far more results than a user can examine. To alleviate this difficulty, we propose a framework for web page filtering in which users describe their needs in two kinds of profiles: keyword profiles and URL profiles. The keywords extracted from each web page are used to find the matching keyword profiles, and the URLs embedded in each web page are used to find the matching URL profiles. As a result, the web page and its embedded URLs are recommended to the users who are interested in them. In this paper, we focus on the indexing of URL profiles and analyze the performance of different indexing methods. Furthermore, we simulate and compare these methods under various parameter settings.
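
To make the URL-profile matching step concrete, the following is a minimal sketch, assuming a simple hash-based inverted index from URL to interested users. It is an illustration only and is not one of the indexing methods analyzed in this paper; the class name `UrlProfileIndex`, its methods, the regular expression `HREF_RE`, and the example URLs are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical, simplified extraction of embedded URLs from raw HTML.
# A real system would use a proper HTML parser and normalize the URLs.
HREF_RE = re.compile(r'href="([^"]+)"', re.IGNORECASE)


class UrlProfileIndex:
    """Illustrative inverted index: URL -> set of users whose profile lists it."""

    def __init__(self):
        self._users_by_url = defaultdict(set)

    def add_profile(self, user, urls):
        """Register a user's URL profile (an iterable of URLs of interest)."""
        for url in urls:
            self._users_by_url[url].add(user)

    def match(self, page_html):
        """Return {user: set of embedded URLs that appear in that user's profile}."""
        hits = defaultdict(set)
        for url in HREF_RE.findall(page_html):
            # .get avoids creating empty entries for URLs nobody asked for.
            for user in self._users_by_url.get(url, ()):
                hits[user].add(url)
        return dict(hits)


index = UrlProfileIndex()
index.add_profile("alice", ["http://a.example/x", "http://b.example/y"])
index.add_profile("bob", ["http://b.example/y"])

page = '<a href="http://b.example/y">y</a> <a href="http://c.example/z">z</a>'
result = index.match(page)
```

In this sketch, each embedded URL costs one hash lookup plus work proportional to the number of users interested in it. Exact string matching is assumed here; other matching rules, such as prefix or site-level matching, would call for a different index structure.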