Pattern-based automatic taxonomy learning from the Web

  • Authors:
  • David Sánchez; Antonio Moreno

  • Affiliations:
  • (Correspd.) Dept. of Comp. Sci. and Math. (DEIM), Univ. Rovira i Virgili (URV), Avda. Països Catalans, 26, 43007 Tarragona, Spain. Tel.: +34 977 559681/ Fax: +34 977 559710/ E-mail: david.san ...; Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA) Research Group, Department of Computer Science and Mathematics (DEIM), University Rovira i Virgili (URV), 43007 Tarragona, Spain

  • Venue:
  • AI Communications
  • Year:
  • 2008

Quantified Score

Hi-index 0.01

Abstract

The construction of taxonomies is considered the first step in structuring domain knowledge. Many methodologies have been developed for building taxonomies from classical information repositories such as dictionaries, databases or domain text. In recent years, however, scientists have started to consider the Web a valuable repository of knowledge. In this paper we present a novel approach, especially adapted to the Web environment, for composing taxonomies in an automatic and unsupervised way. It combines different types of linguistic patterns for hyponymy extraction with carefully designed statistical measures to infer information relevance. The learning performance of the different linguistic patterns and statistical scores is studied and evaluated in order to design a method that maximizes the quality of the results. Our proposal is also evaluated on several well-distinguished domains, offering, in all cases, reliable taxonomies in terms of precision and recall.
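
As an illustration of the two ingredients the abstract names (linguistic patterns for hyponymy extraction and a statistical relevance measure), here is a minimal sketch in Python. The abstract does not list the actual patterns or scores, so everything below is an assumption: the single "such as" pattern is a generic Hearst-style pattern, and `relevance` is a simple co-occurrence ratio over hypothetical web hit counts. Neither should be read as the authors' method.

```python
import re

# Assumed Hearst-style pattern (not taken from the paper):
#   "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
SUCH_AS = re.compile(
    r"(?P<hyper>\w+) such as (?P<hypos>\w+(?:, \w+)*(?:,? (?:and|or) \w+)?)"
)


def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text.lower()):
        # Split the enumerated hyponyms on commas and on "and" / "or".
        hyponyms = re.split(r",? (?:and|or) |, ", match.group("hypos"))
        pairs.extend((hypo, match.group("hyper")) for hypo in hyponyms if hypo)
    return pairs


def relevance(joint_hits, candidate_hits):
    """Hypothetical relevance score: the share of pages mentioning the
    candidate term that also mention the domain term."""
    if candidate_hits == 0:
        return 0.0
    return joint_hits / candidate_hits


sentence = "Sensors such as thermometers, barometers and hygrometers are common."
pairs = extract_hyponyms(sentence)
score = relevance(joint_hits=50, candidate_hits=200)  # made-up hit counts
```

In a Web setting, the hit counts would come from a search engine queried with the candidate term alone and together with the domain term; candidates scoring below some threshold would be discarded before being added to the taxonomy.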