Measuring the interestingness of articles in a limited user environment

  • Authors:
  • R. K. Pon;A. F. Cárdenas;D. J. Buttler;T. J. Critchlow

  • Affiliations:
  • University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, CA 90095, United States;University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, CA 90095, United States;Lawrence Livermore National Laboratory, 7000 East Ave., Livermore, CA 94550, United States;Pacific Northwest National Laboratory, PO Box 999, Richland, WA 99352, United States

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2011

Abstract

Search engines such as Google assign scores to news articles based on their relevance to a query. However, not every article relevant to the query is interesting to a user; for example, an article that is old or yields little new information is uninteresting. Relevance scores do not account for what makes an article interesting, which varies from user to user. Although methods such as collaborative filtering have been shown to be effective in recommendation systems, in a limited user environment there are not enough users to make collaborative filtering effective. A general framework, called iScore, is presented for defining and measuring the "interestingness" of articles, incorporating user feedback. iScore addresses the various aspects of what makes an article interesting, such as topic relevance, uniqueness, freshness, source reputation, and writing style. It employs methods such as multiple topic tracking, online parameter selection, language models, clustering, sentiment analysis, and phrase extraction to measure these features. Because users differ in their reasons for finding an article interesting, an online feature selection method in naïve Bayes is also used to improve recommendation results. iScore can outperform traditional IR techniques by as much as 50.7%. iScore and its components are evaluated on the news recommendation task using three datasets drawn from Yahoo! News, actual users, and Digg.
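
The abstract's core idea of fusing several per-feature interestingness signals through a naïve Bayes model updated from user feedback can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name OnlineInterestClassifier, the feature names ("relevance", "freshness", "uniqueness"), the binarization threshold, and the crude agreement-based feature selection are all assumptions made for the sketch; the paper's actual feature extractors and online selection method are more involved.

```python
from collections import defaultdict

class OnlineInterestClassifier:
    """Bernoulli naive Bayes over binarized per-feature scores, updated online
    from user feedback, with a crude agreement-based feature selection step.
    (Hypothetical sketch; not the iScore implementation.)"""

    def __init__(self, features, threshold=0.5):
        self.features = list(features)
        self.threshold = threshold            # binarize raw feature scores at this cutoff
        self.class_counts = {True: 0, False: 0}
        self.counts = defaultdict(int)        # (label, feature, value) -> observation count
        self.agree = defaultdict(int)         # feature -> times its value matched the label
        self.seen = 0
        self.selected = set(self.features)    # features currently used for prediction

    def _binarize(self, scores):
        return {f: scores[f] >= self.threshold for f in self.features}

    def predict(self, scores):
        """Return P(interesting | article) using only the currently selected features."""
        x = self._binarize(scores)
        total = sum(self.class_counts.values()) + 2        # Laplace-smoothed prior
        post = {}
        for label in (True, False):
            p = (self.class_counts[label] + 1) / total
            for f in self.selected:
                num = self.counts[(label, f, x[f])] + 1    # Laplace-smoothed likelihood
                den = self.class_counts[label] + 2
                p *= num / den
            post[label] = p
        return post[True] / (post[True] + post[False])

    def update(self, scores, interesting):
        """Fold in one user-feedback label and refresh the selected feature set."""
        x = self._binarize(scores)
        self.class_counts[interesting] += 1
        self.seen += 1
        for f in self.features:
            self.counts[(interesting, f, x[f])] += 1
            if x[f] == interesting:
                self.agree[f] += 1
        # Keep only features that have agreed with the user's labels often enough;
        # fall back to all features if the filter would empty the set.
        kept = {f for f in self.features if self.agree[f] / self.seen >= 0.55}
        self.selected = kept or set(self.features)

# Example: three hypothetical per-article feature scores in [0, 1].
clf = OnlineInterestClassifier(["relevance", "freshness", "uniqueness"])
clf.update({"relevance": 0.9, "freshness": 0.8, "uniqueness": 0.2}, interesting=True)
clf.update({"relevance": 0.4, "freshness": 0.1, "uniqueness": 0.3}, interesting=False)
print(clf.predict({"relevance": 0.7, "freshness": 0.9, "uniqueness": 0.1}))
```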