Efficient concept clustering for ontology learning using an event life cycle on the web

Authors:
Sangsoo Sung;Seokkyung Chung;Dennis McLeod
Affiliations:
Google Inc., Parkway Mt. View, CA;Yahoo! Inc., Santa Clara, CA;Univ. of Southern California, Los Angeles, CA
Venue:
Proceedings of the 2008 ACM symposium on Applied computing
Year:
2008

Abstract

Ontology learning integrates many complementary techniques, including machine learning, natural language processing, and data mining. In particular, clustering techniques help establish interrelationships between terms by exploiting similarities between concepts. With the rapid growth of the Web, online information has become a major information source. However, ontology learning that relies on traditional clustering algorithms tends to be slow and computationally expensive when the dataset is as large as the Web. To address this problem, we present an efficient concept clustering technique for ontology learning that reduces the number of required pairwise term similarity computations without a loss of quality. Our approach first identifies relevant terms using a computationally inexpensive similarity metric based on an event life cycle in online news articles, and then performs more sophisticated similarity computations only on those terms. As a result, we can build clusters with high precision and recall at high speed. Without a loss of clustering quality, our framework reduces the number of required computations from $O(N^2)$ to $O(N + L^2)$ ($L \ll N$), where $N$ is the number of candidate concepts. Our experimental results show that clustering based on our similarity framework constructs concept clusters 1541.07% faster than clustering that computes the similarity of all term pairs.
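
The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the event life cycle metric, so the sketch assumes a hypothetical stand-in (`peak_day`, the day on which a term is mentioned most often in news articles) as the inexpensive filter, and cosine similarity over daily counts as a placeholder for the "more sophisticated" similarity. The names `candidate_buckets`, `cluster_pairs`, `window`, and `threshold`, as well as the sample data, are likewise assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations


def peak_day(daily_counts):
    """Cheap key (assumed stand-in): index of the day with the most mentions."""
    return max(range(len(daily_counts)), key=daily_counts.__getitem__)


def candidate_buckets(term_counts, window=2):
    """Single O(N) pass: group terms whose peaks fall in the same time window."""
    buckets = defaultdict(list)
    for term, counts in term_counts.items():
        buckets[peak_day(counts) // window].append(term)
    return buckets


def cosine(u, v):
    """Placeholder for the expensive similarity; any pairwise metric could go here."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def cluster_pairs(term_counts, expensive_sim, threshold=0.5, window=2):
    """Run the expensive similarity only inside each cheap bucket."""
    calls = 0
    similar = []
    for terms in candidate_buckets(term_counts, window).values():
        for a, b in combinations(sorted(terms), 2):
            calls += 1
            if expensive_sim(a, b) >= threshold:
                similar.append((a, b))
    return similar, calls


# Hypothetical daily mention counts for four candidate terms over eight days.
counts = {
    "earthquake": [9, 5, 1, 0, 0, 0, 0, 0],
    "aftershock": [4, 8, 2, 0, 0, 0, 0, 0],
    "election":   [0, 0, 0, 0, 0, 1, 7, 3],
    "ballot":     [0, 0, 0, 0, 0, 0, 3, 6],
}

similar, calls = cluster_pairs(counts, lambda a, b: cosine(counts[a], counts[b]))
print(similar, calls)
```

With four terms, an all-pairs approach would need six expensive comparisons; here the cheap pass leaves two buckets of two terms each, so only two comparisons are made. Fixed windows are a simplification: terms whose peaks straddle a window boundary would be separated, which a real filter would need to handle (for example with overlapping windows).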