Inducing classes of terms from text

Authors:
Pablo Gamallo;Gabriel P. Lopes;Alexandre Agustini
Affiliations:
Faculdade de Filologia, Universidade de Santiago de Compostela, Spain;Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Portugal;Departamento de Informática, PUCRS, Brazil
Venue:
TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
Year:
2007

Abstract

This paper describes a clustering method for organizing a list of terms into semantic classes. The experiments were carried out on a POS-annotated corpus, the ACL Anthology, which consists of technical articles in the field of Computational Linguistics. The method, based mainly on assumptions from Formal Concept Analysis, builds bi-dimensional clusters of both terms and their lexico-syntactic contexts. Each generated cluster is defined as a semantic class with a set of terms describing the extension of the class and a set of contexts taken as the intensional attributes (or properties) that hold for all the terms in the extension. The clustering process relies on two restrictive operations: abstraction and specification. The result is a concept lattice that describes a domain-specific ontology of terms.
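
To make the extension/intension pairing concrete, the following is a minimal sketch of the standard Formal Concept Analysis derivation operators applied to a term-context relation. It is not the paper's abstraction and specification procedure, which the abstract does not detail. The toy data in `TERM_CONTEXTS` and the function names `common_contexts`, `common_terms`, and `formal_concepts` are hypothetical, chosen only for illustration, and the brute-force enumeration is suitable for tiny inputs only.

```python
from itertools import combinations

# Hypothetical toy relation: each term maps to the lexico-syntactic
# contexts it was observed in. Not data from the paper.
TERM_CONTEXTS = {
    "parser": {"accuracy_of", "train"},
    "tagger": {"accuracy_of", "train"},
    "corpus": {"annotate", "size_of"},
    "treebank": {"annotate", "size_of", "train_on"},
}


def common_contexts(terms, relation):
    """Intension: the contexts shared by every term in `terms`."""
    shared = set().union(*relation.values())  # start from all contexts
    for term in terms:
        shared &= relation[term]
    return shared


def common_terms(contexts, relation):
    """Extension: the terms that occur in every context in `contexts`."""
    return {term for term, ctxs in relation.items() if contexts <= ctxs}


def formal_concepts(relation):
    """Enumerate all (extension, intension) pairs by closing every term subset.

    Exponential in the number of terms; an assumption-laden sketch,
    not an efficient lattice-construction algorithm.
    """
    concepts = set()
    terms = sorted(relation)
    for size in range(len(terms) + 1):
        for subset in combinations(terms, size):
            intent = common_contexts(subset, relation)
            extent = common_terms(intent, relation)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the most specific (fewest terms) to the most general.
    for extent, intent in sorted(formal_concepts(TERM_CONTEXTS),
                                 key=lambda c: (len(c[0]), sorted(c[0]))):
        print(sorted(extent), "<->", sorted(intent))
```

In this sketch each pair is closed in both directions: the contexts are exactly those shared by all terms in the extension, and the terms are exactly those that occur in all contexts of the intension. Ordering the pairs by inclusion of their extensions gives the concept lattice.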