On exploiting content and citations together to compute similarity of scientific papers

  • Authors:
  • Masoud Reyhani Hamedani;Sang-Wook Kim;Sang-Chul Lee;Dong-Jin Kim

  • Affiliations:
  • Hanyang University, Seoul, South Korea;Hanyang University, Seoul, South Korea;Hanyang University, Seoul, South Korea;NHN Institute of The Next Network, Seoul, South Korea

  • Venue:
  • Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
  • Year:
  • 2013

Abstract

In computing the similarity of scientific papers, previous text-based and link-based similarity measures consider only one of the two sources of information, either content or citations. In this paper, we propose a novel approach called SimCC that combines content and citation information to compute the similarity of scientific papers accurately. Unlike previous approaches, SimCC represents both the authority and the context of a scientific paper simultaneously when computing similarities. We also propose SimCC+A, which takes recently published papers into account. The effectiveness of the proposed method is demonstrated via extensive experiments on a real-world dataset of scientific papers, with more than 100% improvement in accuracy compared with previous methods.