Improving text similarity measurement by critical sentence vector model

  • Authors:
  • Wei Li;Kam-Fai Wong;Chunfa Yuan;Wenjie Li;Yunqing Xia

  • Affiliations:
  • Department of Systems Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong;Department of Systems Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong;State Key Laboratory of Intelligent Technology and System, Tsinghua University, Beijing, China;Department of Computing, Hong Kong Polytechnic University, Hung Hom, Hong Kong;Department of Systems Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong

  • Venue:
  • AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
  • Year:
  • 2005

Abstract

We propose the Critical Sentence Vector Model (CSVM), a novel model for measuring text similarity. CSVM accounts for both the structural and the semantic information of a document. Unlike existing methods based on keyword vectors, such as the Vector Space Model (VSM), CSVM measures document similarity by comparing critical sentence vectors extracted from the documents. Experiments show that CSVM outperforms VSM in calculating text similarity.
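
For orientation, the keyword-vector baseline that the abstract compares against can be sketched as below. This is only a minimal illustration of a generic VSM with raw term-frequency weights and cosine similarity. It is not the CSVM itself, whose critical-sentence extraction is not described in the abstract. The function names (`term_vector`, `cosine_similarity`, `vsm_similarity`), the tokenizer, and the weighting scheme are all assumptions made for this sketch, not details taken from the paper.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Build a sparse term-frequency vector (assumed weighting: raw counts)."""
    # Assumed tokenizer: lowercase, then keep runs of ASCII letters and digits.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Counter returns 0 for terms missing from vec_b, so only shared terms add up.
    dot = sum(count * vec_b[term] for term, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def vsm_similarity(doc_a, doc_b):
    """Keyword-vector (VSM-style) similarity between two documents."""
    return cosine_similarity(term_vector(doc_a), term_vector(doc_b))
```

Because a keyword vector ignores word order, this baseline scores "dog bites man" and "man bites dog" as identical (a cosine of 1, up to floating-point rounding). That limitation is the kind of missing structural information that the abstract says CSVM is designed to account for.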