Information shared by many objects

  • Authors:
  • Chong Long;Xiaoyan Zhu;Ming Li;Bin Ma

  • Affiliations:
  • Tsinghua University, Beijing, China;Tsinghua University, Beijing, China;University of Waterloo, Waterloo, ON, Canada;University of Waterloo, Waterloo, ON, Canada

  • Venue:
  • Proceedings of the 17th ACM conference on Information and knowledge management
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

If Kolmogorov complexity [25] measures information in one object and Information Distance measures information shared by two objects, how do we measure information shared by many objects? This paper provides an initial pragmatic study of this fundamental data mining question. Firstly, Em(x1,x2,...,xn) is defined to be the minimum amount of thermodynamic energy needed to convert from any xi to any xj. With this definition several theoretical problems have been solved. Second, our newly proposed theory is applied to select a comprehensive review and a specialized review from many reviews: (1) Core feature words, expanded words and dependent words are extracted respectively. (2) Comprehensive and specialized reviews are selected according to the information among them. This method of selecting a single review can be extended to select multiple reviews as well. Finally, experiments show that this comprehensive and specialized review mining method based on our new theory can do the job efficiently.