Towards the Automatic Construction of Conceptual Taxonomies

Authors:
Dino Ienco; Rosa Meo
Affiliations:
Dipartimento di Informatica, Università di Torino, Italy; Dipartimento di Informatica, Università di Torino, Italy
Venue:
DaWaK '08 Proceedings of the 10th international conference on Data Warehousing and Knowledge Discovery
Year:
2008

Abstract

In this paper we investigate the automatic construction of conceptual taxonomies and evaluate the achievable results. The hierarchy is built with Ward's algorithm, guided by the Goodman-Kruskal τ as the proximity measure. We then provide a concise description of each cluster by a representative keyword selected with PageRank. The resulting hierarchy (KH) has the same advantages, both descriptive and operative, as keyword indices that partition a set of documents with respect to their content. We performed experiments on a real case: the abstracts of the papers published in ACM TODS, which have been manually classified into the ACM Computing Taxonomy (CT). We evaluated the generated hierarchy objectively with two methods, the Jaccard measure and entropy, and obtained good results with both. Finally, we evaluated the capability to classify documents into the categories of the two taxonomies, showing that KH provides a greater facility than CT.
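
The abstract names its measures without giving formulas, so the following is only a minimal sketch of the textbook definitions of the Goodman-Kruskal τ, the Jaccard measure, and class entropy. The function names, the toy inputs, and the convention of returning 0 for a constant target are assumptions made for illustration. How the authors turn τ into a proximity for Ward's algorithm, and how they apply the two evaluation measures, is not specified here and is not reproduced.

```python
from collections import Counter
from math import log2


def goodman_kruskal_tau(x, y):
    """tau(X -> Y): proportional reduction in the error of predicting y once x is known.

    x and y are equal-length sequences of categorical values,
    e.g. presence/absence of two keywords across the same documents.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    count_x = Counter(x)
    count_xy = Counter(zip(x, y))
    # Prediction error for y without knowing x: 1 - sum_j p(y_j)^2
    err_y = 1.0 - sum((c / n) ** 2 for c in Counter(y).values())
    if err_y == 0.0:
        return 0.0  # y is constant; returning 0 is a convention assumed here
    # Prediction error for y knowing x: 1 - sum_ij p(x_i, y_j)^2 / p(x_i)
    err_y_given_x = 1.0 - sum(
        c * c / (n * count_x[xv]) for (xv, _), c in count_xy.items()
    )
    return (err_y - err_y_given_x) / err_y


def jaccard(a, b):
    """Jaccard measure between two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def class_entropy(labels):
    """Shannon entropy (in bits) of the class labels found in one cluster."""
    n = len(labels)
    return sum((c / n) * log2(n / c) for c in Counter(labels).values())


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    x = [0, 1, 2, 3]
    y = [0, 0, 1, 1]
    print(goodman_kruskal_tau(x, y))  # x fully determines y
    print(goodman_kruskal_tau(y, x))  # the reverse direction is weaker
    print(jaccard({1, 2, 3}, {2, 3, 4}))
    print(class_entropy(["a", "a", "b", "b"]))
```

Note that τ is asymmetric: predicting y from x and predicting x from y generally give different values, so any use as a proximity measure has to settle on a direction or a way of combining the two.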