Prediction of Web Page Accesses by Proxy Server Log

As the population of web users grows, so does the variety of ways in which users access information, which has a great impact on network utilization. Recently, many efforts have been made to analyze user behavior on the WWW. In this paper, we represent user behavior as sequences of consecutive web page accesses, derived from the access log of a proxy server. The frequent sequences are then discovered and organized as an index. Based on this index, we propose a scheme for predicting user requests and a proxy-based framework for prefetching web pages. We perform experiments on real data. The results show that our approach makes predictions with high accuracy and little overhead. In the experiments, the highest prediction hit ratio reaches 75.69%, while the longest time needed to make a prediction is only 1.9 ms.
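
To make the idea concrete, the following is a minimal sketch of one way such an index could be built and queried. It is not the scheme proposed in this paper. It assumes that the proxy log has already been split into per-client lists of page URLs, that "frequent" means a simple occurrence count at or above a threshold, and that a prediction is the most frequent successor of the longest matching recent prefix. The names build_index, predict_next, hit_ratio, max_len, and min_support are hypothetical and introduced only for illustration.

```python
from collections import Counter, defaultdict


def build_index(sessions, max_len=3, min_support=2):
    """Index frequent consecutive-access sequences by their prefix.

    sessions: list of page lists, one per client session (assumed input).
    Returns a dict mapping prefix tuple -> Counter of next pages.
    """
    counts = Counter()
    for pages in sessions:
        # Count every run of 2..max_len consecutive page accesses.
        for n in range(2, max_len + 1):
            for i in range(len(pages) - n + 1):
                counts[tuple(pages[i:i + n])] += 1

    index = defaultdict(Counter)
    for seq, count in counts.items():
        if count >= min_support:  # keep only frequent sequences
            index[seq[:-1]][seq[-1]] = count
    return dict(index)


def predict_next(index, recent, max_len=3):
    """Predict the next page from the most recent accesses.

    Tries the longest usable suffix of `recent` first and falls back
    to shorter ones. Returns None when no indexed prefix matches.
    """
    for k in range(min(len(recent), max_len - 1), 0, -1):
        prefix = tuple(recent[-k:])
        if prefix in index:
            return index[prefix].most_common(1)[0][0]
    return None


def hit_ratio(index, sessions, max_len=3):
    """Fraction of issued predictions that match the actual next access.

    This definition is an assumption; the paper may define it differently.
    """
    hits = total = 0
    for pages in sessions:
        for i in range(1, len(pages)):
            guess = predict_next(index, pages[:i], max_len)
            if guess is not None:
                total += 1
                hits += guess == pages[i]
    return hits / total if total else 0.0
```

In a prefetching proxy, the page returned by predict_next would be a candidate to fetch before the client asks for it. Because each lookup is a small number of dictionary probes, the prediction step in a design of this kind stays cheap relative to a network fetch.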