DCFLA: A distributed collaborative-filtering neighbor-locating algorithm

  • Authors:
  • Bo Xie;Peng Han;Fan Yang;Rui-Min Shen;Hua-Jun Zeng;Zheng Chen

  • Affiliations:
  • Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030, China;Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030, China;Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030, China;Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030, China;Microsoft Research Asia, 5F Sigma Center, 49 Zhichun Road, Beijing 100080, China;Microsoft Research Asia, 5F Sigma Center, 49 Zhichun Road, Beijing 100080, China

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2007

Quantified Score

Hi-index 0.07

Abstract

Although collaborative filtering (CF) has proved to be one of the most successful techniques in recommendation systems, it scales poorly: its time complexity grows rapidly as the number of records in the user database increases. As a result, distributed collaborative filtering (DCF) is attracting increasing attention as an alternative implementation scheme for CF-based recommendation systems. In this paper, we first propose a distributed user-profile management scheme using distributed hash table (DHT)-based routing algorithms, which are among the most popular and effective approaches in peer-to-peer (P2P) overlay networks. Within this DCF scheme, we propose an efficient DCF neighbor-locating algorithm (DCFLA), together with two improvements, most same opinion (MSO) and average rating normalization (ARN), that reduce network traffic and time cost. Finally, we analyze the performance of one baseline and three proposed CF algorithms: (1) a traditional memory-based CF (baseline); (2) a basic DHT-based CF; (3) a DHT-based CF with MSO; and (4) a DHT-based CF with MSO and ARN. The experimental results show that the proposed DCFLA scales much better than the traditional centralized CF algorithm, while the prediction accuracies of the two systems are comparable.
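
The abstract does not spell out how neighbors are located, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that user profiles are indexed under keys of the form (item, opinion), that a plain in-memory dictionary stands in for the DHT, that ARN means expressing each rating relative to the user's own average, and that MSO means keeping the candidates who share the most identical opinions with the target user. The names `arn_bucket`, `build_index`, and `locate_neighbors` are hypothetical.

```python
from collections import Counter, defaultdict


def arn_bucket(rating, user_mean):
    """Assumed ARN reading: describe a rating relative to the user's own mean."""
    if rating > user_mean:
        return "above"
    if rating < user_mean:
        return "below"
    return "equal"


def build_index(profiles):
    """Stand-in for DHT buckets: maps (item, opinion) to the users holding it.

    In a real P2P overlay each key would be hashed and routed to a peer;
    a local dict is used here purely for illustration.
    """
    index = defaultdict(set)
    for user, ratings in profiles.items():
        mean = sum(ratings.values()) / len(ratings)
        for item, rating in ratings.items():
            index[(item, arn_bucket(rating, mean))].add(user)
    return index


def locate_neighbors(user, profiles, index, k=2):
    """Assumed MSO reading: return the k users sharing the most same opinions."""
    ratings = profiles[user]
    mean = sum(ratings.values()) / len(ratings)
    shared = Counter()
    for item, rating in ratings.items():
        # One lookup per rated item; only users with the same opinion are returned.
        for other in index.get((item, arn_bucket(rating, mean)), ()):
            if other != user:
                shared[other] += 1
    return [candidate for candidate, _ in shared.most_common(k)]
```

Under these assumptions, a target user fetches only the buckets matching their own opinions rather than scanning every profile, which is the kind of saving a centralized memory-based CF cannot offer. Limiting the result to the top k candidates is one way the returned set, and hence the traffic, could be kept small.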