Domain taxonomy learning from text: The subsumption method versus hierarchical clustering

Authors:
Jeroen De Knijff; Flavius Frasincar; Frederik Hogenboom
Venue:
Data & Knowledge Engineering
Year:
2013

Abstract

This paper proposes a framework to automatically construct taxonomies from a corpus of text documents. This framework first extracts terms from documents using a part-of-speech parser. These terms are then filtered using domain pertinence, domain consensus, lexical cohesion, and structural relevance. The remaining terms represent concepts in the taxonomy. These concepts are arranged in a hierarchy with either the extended subsumption method that accounts for concept ancestors in determining the parent of a concept or a hierarchical clustering algorithm that uses various text-based window and document scopes for concept co-occurrences. Our evaluation in the field of management and economics indicates that a trade-off between taxonomy quality and depth must be made when choosing one of these methods. The subsumption method is preferable for shallow taxonomies, whereas the hierarchical clustering algorithm is recommended for deep taxonomies.
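
The abstract does not spell out how the extended subsumption method works, so the following is only a rough sketch of the basic idea behind subsumption, not the authors' implementation. It assumes the classic document-level test, in which a concept x is a candidate parent of a concept y when P(x | y) is at least some threshold t and P(y | x) is below 1. The threshold value of 0.8, the function names, the toy documents, and the rule of preferring the most specific candidate (the one occurring in the fewest documents) are all assumptions made for illustration. The ancestor-aware extension mentioned in the abstract is not modelled here.

```python
from collections import defaultdict


def document_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, terms in enumerate(docs):
        for term in terms:
            index[term].add(doc_id)
    return index


def cond_prob(index, x, y):
    """Estimate P(x | y): the share of documents containing y that also contain x."""
    docs_y = index.get(y, set())
    if not docs_y:
        return 0.0
    return len(index.get(x, set()) & docs_y) / len(docs_y)


def subsumes(index, x, y, t=0.8):
    """Basic subsumption test (assumed form): x is a candidate parent of y."""
    return (
        x != y
        and cond_prob(index, x, y) >= t
        and cond_prob(index, y, x) < 1.0
    )


def pick_parents(index, t=0.8):
    """Choose one parent per concept among the concepts that subsume it.

    Tie-breaking is an assumption of this sketch: highest P(x | y) first,
    then the candidate found in the fewest documents, then alphabetical order.
    """
    parents = {}
    for y in index:
        candidates = [x for x in index if subsumes(index, x, y, t)]
        if candidates:
            parents[y] = min(
                candidates,
                key=lambda x: (-cond_prob(index, x, y), len(index[x]), x),
            )
    return parents


# Hypothetical toy corpus: each document is reduced to its set of concepts.
docs = [
    {"economics", "finance", "banking"},
    {"economics", "finance"},
    {"economics", "management"},
    {"economics"},
]
index = document_index(docs)
parents = pick_parents(index)
for child in sorted(parents):
    print(child, "->", parents[child])
```

On this toy corpus "economics" has no parent and therefore acts as the root, while the more specific concepts attach beneath it. A real run would take its concepts from the filtered term list described in the abstract rather than from hand-written sets.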