Using generalization of syntactic parse trees for taxonomy capture on the web

  • Authors:
  • Boris A. Galitsky;Gábor Dobrocsi;Josep Lluis de la Rosa;Sergei O. Kuznetsov

  • Affiliations:
  • University of Girona, Girona, Catalonia, Spain;University of Girona, Girona, Catalonia, Spain;University of Girona, Girona, Catalonia, Spain;Higher School of Economics, Moscow, Russia

  • Venue:
  • ICCS'11 Proceedings of the 19th international conference on Conceptual structures for discovering knowledge
  • Year:
  • 2011

Quantified Score

Hi-index 0.01

Abstract

We implement a scalable mechanism for building a taxonomy of entities that improves the relevance of a search engine in a vertical domain. Taxonomy construction starts from seed entities and mines the web for new entities associated with them. To form these new entities, machine learning of syntactic parse trees (syntactic generalization) is applied to find commonalities between the various web search results for existing entities. The taxonomy and syntactic generalization are applied to relevance improvement in search and to text similarity assessment in a commercial setting; evaluation results show a substantial contribution from both sources.
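
The taxonomy-growing loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. As a loudly stated simplification, the generalization of two search results is approximated here by the longest common subsequence of their word tokens, whereas the paper generalizes syntactic parse trees. All names (`generalize`, `expand`, `build_taxonomy`, `fetch_snippets`, `STOPWORDS`) are hypothetical, and the web search is replaced by a caller-supplied function.

```python
from itertools import combinations

# Hypothetical stopword list; a real system would rely on parse-tree structure instead.
STOPWORDS = {"a", "an", "the", "of", "for", "to", "in", "on", "and", "is", "how"}


def generalize(tokens_a, tokens_b):
    """Longest common subsequence of two token lists.

    ASSUMPTION: a token-level stand-in for syntactic generalization,
    which in the paper operates on parse trees, not flat token lists.
    """
    n, m = len(tokens_a), len(tokens_b)
    # table[i][j] = LCS length of tokens_a[i:] and tokens_b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if tokens_a[i] == tokens_b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table to recover one common subsequence.
    common, i, j = [], 0, 0
    while i < n and j < m:
        if tokens_a[i] == tokens_b[j]:
            common.append(tokens_a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return common


def expand(path, snippets):
    """Return candidate new entities shared by pairs of snippets for `path`."""
    known = set(path) | STOPWORDS
    found = []
    for a, b in combinations(snippets, 2):
        for word in generalize(a.lower().split(), b.lower().split()):
            if word not in known and word not in found:
                found.append(word)
    return found


def build_taxonomy(seed, fetch_snippets, depth=2):
    """Grow a taxonomy from one seed entity, level by level.

    `fetch_snippets(query)` is a hypothetical callable that returns
    search-result snippets (strings) for a query.
    """
    taxonomy = {}
    frontier = [(seed,)]
    for _ in range(depth):
        next_frontier = []
        for path in frontier:
            children = expand(path, fetch_snippets(" ".join(path)))
            taxonomy[path] = children
            next_frontier.extend(path + (child,) for child in children)
        frontier = next_frontier
    return taxonomy
```

Each key of the returned dictionary is a path from the seed entity, and its value lists the entities found to be common to the search results for that path. Passing the search step in as a function keeps the sketch runnable offline with canned snippets.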