Collaborative Filtering by Mining Association Rules from User Access Sequences

  • Authors:
  • Mei-Ling Shyu;Choochart Haruechaiyasak;Shu-Ching Chen;Na Zhao

  • Affiliations:
  • Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL, USA; Information Research and Development Division (RDI), National Electronics and Computer Technology Center (NECTEC), Thailand Science Park; Distributed Multimedia Information System Laboratory, School of Computer Florida International University, Miami, FL, USA; Distributed Multimedia Information System Laboratory, School of Computer Florida International University, Miami, FL, USA

  • Venue:
  • WIRI '05 Proceedings of the International Workshop on Challenges in Web Information Retrieval and Integration
  • Year:
  • 2005

Abstract

Recent research in mining user access patterns for predicting Web page requests focuses only on consecutive sequential Web page accesses, i.e., pages that are accessed by following hyperlinks. In this paper, we propose a new method for mining user access patterns that allows the prediction of multiple non-consecutive Web pages, i.e., any pages within the Web site. Our approach consists of two major steps. First, a shortest path algorithm from graph theory is applied to find the distances between Web pages. In order to capture user access behavior on the Web, the distances are derived from user access sequences, as opposed to static structural hyperlinks. We refer to these distances as Minimum Reaching Distance (MRD) information. Second, the association rule mining (ARM) technique is applied to form a set of predictive rules, which are further refined and pruned by using the MRD information. The proposed approach is applied as a collaborative filtering technique to recommend Web pages within a Web site. Experimental results demonstrate that our approach improves on the existing Markov model approach in terms of precision and recall, and also has greater potential to reduce user access time on the Web.
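
The two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: every consecutive page pair observed in a session is treated as a directed edge of unit length, MRD is computed with breadth-first search, rules are limited to single-page antecedents and consequents, and a rule is pruned when its consequent is unreachable or farther than a hypothetical `max_distance` threshold. All function and parameter names are illustrative.

```python
from collections import defaultdict, deque
from itertools import permutations


def build_graph(sessions):
    """Directed graph whose edges come from observed accesses, not static links."""
    graph = defaultdict(set)
    for session in sessions:
        for a, b in zip(session, session[1:]):
            if a != b:
                graph[a].add(b)
    return graph


def mrd_from(graph, source):
    """Shortest hop count from source to every reachable page (BFS, unit edges)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pairwise_rules(sessions, min_support, min_confidence):
    """Rules a -> b over page co-occurrence within a session (assumed rule form)."""
    n = len(sessions)
    page_count = defaultdict(int)
    pair_count = defaultdict(int)
    for session in sessions:
        pages = set(session)
        for p in pages:
            page_count[p] += 1
        for a, b in permutations(pages, 2):
            pair_count[(a, b)] += 1
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        confidence = count / page_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules


def prune_by_mrd(rules, graph, max_distance):
    """Keep a -> b only if b is reachable from a within max_distance hops.

    This pruning criterion is an assumption made for illustration.
    """
    kept = {}
    cache = {}
    for (a, b), stats in rules.items():
        if a not in cache:
            cache[a] = mrd_from(graph, a)
        d = cache[a].get(b)
        if d is not None and d <= max_distance:
            kept[(a, b)] = stats
    return kept


if __name__ == "__main__":
    # Toy sessions; page names are made up.
    sessions = [["A", "B", "C"], ["A", "B", "D"], ["A", "C"], ["B", "C"]]
    graph = build_graph(sessions)
    rules = pairwise_rules(sessions, min_support=0.5, min_confidence=0.6)
    for rule, stats in sorted(prune_by_mrd(rules, graph, max_distance=2).items()):
        print(rule, stats)
```

In this sketch, co-occurrence alone yields rules in both directions between two pages, and the MRD check removes the direction that no user was ever observed to traverse.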