The effect of sparsity on collaborative filtering metrics

  • Authors: Jesus Bobadilla; Francisco Serradilla
  • Affiliations: Universidad Politecnica de Madrid, Madrid, Spain; Universidad Politecnica de Madrid, Madrid, Spain
  • Venue: ADC '09 Proceedings of the Twentieth Australasian Conference on Australasian Database - Volume 92
  • Year: 2009

Abstract

This paper presents a detailed study of the behavior of three different content-based collaborative filtering metrics (correlation, cosine and mean squared difference) when they are applied to several rating matrices with different levels of sparsity. A total of 648 experiments were carried out, varying the following parameters: the metric used, the number of neighbors (k), the sparsity level and the type of result (mean absolute error, percentage of incorrect predictions, percentage of correct predictions and capacity to generate predictions). The results are illustrated in representative two- and three-dimensional graphs. The conclusions of the paper emphasize the superiority of the correlation metric over the cosine metric, and the unusually good results of the mean squared difference metric when used on matrices with high sparsity levels, which points to interesting future studies.
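
As a rough illustration of the three metrics named in the abstract, the sketch below computes each one over the items two users have both rated, along with the sparsity of a small rating matrix. It is not taken from the paper: the function names, the dictionary-based rating layout, the 1 to 5 rating scale, and the choice to express mean squared difference as a similarity (1 minus the MSD of range-normalized ratings) are all assumptions made for the example.

```python
import math

def co_rated(u, v):
    """Items rated by both users; u and v map item id -> rating (assumed layout)."""
    return sorted(set(u) & set(v))

def pearson(u, v):
    """Pearson correlation over co-rated items; None if it is undefined."""
    items = co_rated(u, v)
    if len(items) < 2:
        return None
    mu = sum(u[i] for i in items) / len(items)
    mv = sum(v[i] for i in items) / len(items)
    num = sum((u[i] - mu) * (v[i] - mv) for i in items)
    den = math.sqrt(sum((u[i] - mu) ** 2 for i in items)
                    * sum((v[i] - mv) ** 2 for i in items))
    return num / den if den else None

def cosine(u, v):
    """Cosine of the angle between the two co-rated rating vectors."""
    items = co_rated(u, v)
    if not items:
        return None
    num = sum(u[i] * v[i] for i in items)
    den = (math.sqrt(sum(u[i] ** 2 for i in items))
           * math.sqrt(sum(v[i] ** 2 for i in items)))
    return num / den if den else None

def msd_similarity(u, v, r_min=1, r_max=5):
    """1 - mean squared difference of ratings scaled by the rating range.

    The 1..5 scale and the '1 - MSD' form are assumptions for this sketch.
    """
    items = co_rated(u, v)
    if not items:
        return None
    span = r_max - r_min
    msd = sum(((u[i] - v[i]) / span) ** 2 for i in items) / len(items)
    return 1 - msd

def sparsity(ratings, n_items):
    """Fraction of empty cells in a users x items rating matrix."""
    filled = sum(len(r) for r in ratings.values())
    return 1 - filled / (len(ratings) * n_items)

# Hypothetical toy data: two users, four items, one rating missing.
alice = {"a": 5, "b": 3, "c": 4}
bob = {"a": 4, "b": 2, "c": 3, "d": 5}
ratings = {"alice": alice, "bob": bob}

print(pearson(alice, bob), cosine(alice, bob), msd_similarity(alice, bob))
print(sparsity(ratings, n_items=4))
```

In this toy case the two users differ by a constant offset, so correlation treats them as identical in taste while mean squared difference penalizes the offset. As sparsity grows, fewer items are co-rated and each function returns None more often, which is one way the "capacity to generate predictions" can drop.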