Inducing classes of terms from text

  • Authors:
  • Pablo Gamallo; Gabriel P. Lopes; Alexandre Agustini

  • Affiliations:
  • Faculdade de Filologia, Universidade de Santiago de Compostela, Spain; Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Portugal; Departamento de Informática, PUCRS, Brazil

  • Venue:
  • TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
  • Year:
  • 2007

Abstract

This paper describes a clustering method for organizing a list of terms into semantic classes. The experiments were carried out on a POS-annotated corpus, the ACL Anthology, which consists of technical articles in the field of Computational Linguistics. The method, based mainly on assumptions from Formal Concept Analysis, builds bi-dimensional clusters of both terms and their lexico-syntactic contexts. Each generated cluster is defined as a semantic class with a set of terms describing the extension of the class and a set of contexts taken as the intensional attributes (or properties) that hold for all the terms in the extension. The clustering process relies on two restrictive operations: abstraction and specification. The result is a concept lattice that describes a domain-specific ontology of terms.
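
To make the extension/intension pairing concrete, the following is a minimal sketch of how formal concepts can be derived from a term-by-context relation in standard Formal Concept Analysis. It is not the authors' algorithm: the abstraction and specification operations are not reproduced here, and the terms, the context labels (such as `use_X`), and all function names are hypothetical, chosen only for illustration. The sketch closes every subset of terms, which is exponential and suitable only for toy inputs.

```python
from itertools import combinations

# Toy term -> lexico-syntactic contexts relation.
# Hypothetical data for illustration; not taken from the paper.
INCIDENCE = {
    "parser": {"use_X", "train_X"},
    "tagger": {"use_X", "train_X"},
    "corpus": {"use_X", "annotate_X"},
    "treebank": {"annotate_X"},
}


def common_contexts(terms, incidence):
    """Intension: the contexts shared by every term in `terms`."""
    shared = set().union(*incidence.values())
    for term in terms:
        shared &= incidence[term]
    return frozenset(shared)


def terms_with(contexts, incidence):
    """Extension: the terms that occur with every context in `contexts`."""
    return frozenset(t for t, ctx in incidence.items() if contexts <= ctx)


def formal_concepts(incidence):
    """Enumerate (extension, intension) pairs by closing each term subset."""
    concepts = set()
    terms = sorted(incidence)
    for size in range(len(terms) + 1):
        for subset in combinations(terms, size):
            intension = common_contexts(subset, incidence)
            extension = terms_with(intension, incidence)
            concepts.add((extension, intension))
    return concepts


if __name__ == "__main__":
    # Print concepts from the most specific (fewest terms) to the most general.
    for extension, intension in sorted(
        formal_concepts(INCIDENCE), key=lambda c: (len(c[0]), sorted(c[0]))
    ):
        print(sorted(extension), "<->", sorted(intension))
```

Each pair printed is a bi-dimensional cluster in the sense used above: every term in the extension occurs with every context in the intension, and neither set can be enlarged without breaking that property. Ordering the pairs by set inclusion of their extensions yields the concept lattice.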