Document Comparison with a Weighted Topic Hierarchy

  • Authors:
  • Alexander F. Gelbukh; Grigori Sidorov; Adolfo Guzman-Arenas

  • Venue:
  • DEXA '99 Proceedings of the 10th International Workshop on Database & Expert Systems Applications
  • Year:
  • 1999

Abstract

A method of document comparison based on a hierarchical dictionary of topics (concepts) is described. The hierarchical links in the dictionary carry weights, which are used to detect the main topics of a document and to determine the similarity between two documents. The method can compare documents that share no words literally but do share concepts, including documents written in different languages. It also supports comparison with respect to a specific "aspect," i.e., a specific topic of interest together with its subtopics. A system, Classifier, that applies the method to document classification and information retrieval is presented.
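
The abstract gives no formulas, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a toy hierarchy (`PARENT`), a toy bilingual keyword dictionary (`KEYWORDS`), multiplicative propagation of keyword counts along weighted links, and cosine similarity over the resulting topic weights. All names, weights, and the scoring scheme are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy hierarchy: child topic -> (parent topic, link weight).
PARENT = {
    "soccer": ("sports", 0.8),
    "tennis": ("sports", 0.8),
    "sports": ("ANY", 0.5),
    "elections": ("politics", 0.9),
    "politics": ("ANY", 0.5),
}

# Hypothetical keyword dictionary: word (English or Spanish) -> leaf topic.
KEYWORDS = {
    "goal": "soccer", "gol": "soccer", "penalty": "soccer",
    "racket": "tennis", "raqueta": "tennis",
    "ballot": "elections", "voto": "elections",
}


def topic_weights(text):
    """Count dictionary keywords and propagate each hit up the hierarchy.

    Assumption: a hit contributes 1.0 to its leaf topic, and the
    contribution is multiplied by the link weight at every step upward.
    """
    weights = Counter()
    for word in text.lower().split():
        topic = KEYWORDS.get(word)
        score = 1.0
        while topic is not None:
            weights[topic] += score
            parent, link = PARENT.get(topic, (None, 0.0))
            score *= link
            topic = parent
    return weights


def main_topics(text, n=2):
    """Return the n topics with the largest accumulated weight."""
    return [topic for topic, _ in topic_weights(text).most_common(n)]


def in_aspect(topic, aspect):
    """True if topic is the aspect itself or one of its subtopics."""
    while topic is not None:
        if topic == aspect:
            return True
        topic = PARENT.get(topic, (None, 0.0))[0]
    return False


def similarity(text_a, text_b, aspect=None):
    """Cosine similarity of topic weights, optionally limited to an aspect."""
    wa, wb = topic_weights(text_a), topic_weights(text_b)
    if aspect is not None:
        wa = {t: w for t, w in wa.items() if in_aspect(t, aspect)}
        wb = {t: w for t, w in wb.items() if in_aspect(t, aspect)}
    dot = sum(w * wb.get(t, 0.0) for t, w in wa.items())
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # No shared words, but the same concept in two languages.
    print(similarity("goal penalty", "gol gol"))
    # Different leaf topics that share the parent topic "sports".
    print(similarity("goal", "raqueta"))
    # Comparison restricted to the aspect "sports".
    print(similarity("goal", "voto", aspect="sports"))
```

In this sketch, documents with no words in common can still score as similar because their keywords map to the same or neighboring topics, and passing `aspect` discards every topic outside the chosen subtree before the comparison.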