Efficient concept clustering for ontology learning using an event life cycle on the web

  • Authors:
  • Sangsoo Sung;Seokkyung Chung;Dennis McLeod

  • Affiliations:
  • Google Inc., Parkway Mt. View, CA;Yahoo! Inc., Santa Clara, CA;Univ. of Southern California, Los Angeles, CA

  • Venue:
  • Proceedings of the 2008 ACM symposium on Applied computing
  • Year:
  • 2008

Abstract

Ontology learning integrates many complementary techniques, including machine learning, natural language processing, and data mining. Specifically, clustering techniques facilitate the building of interrelationships between terms by exploiting similarities of concepts. With the rapid growth of the Web, online information has become one of the major information sources. The ontology learning process where traditional clustering algorithms are involved tends to be slow and computationally expensive when the dataset is as large as the Web. To address this problem, we present an efficient concept clustering technique for ontology learning that reduces the number of required pairwise term similarity computations without a loss of quality. Our approach is to identify relevant terms using a computationally inexpensive similarity metric based on an event life cycle in online news articles. Then, we perform more sophisticated similarity computations. Hence, we can build clusters with high precision/recall and high speed. Without a loss of clustering quality, our framework reduces the number of required computations from O(N²) to O(N + L²) (L ≪ N), where N is the number of candidate concepts. Our experimental results show that clustering based on our similarity framework can construct concept clusters 1541.07% faster than clustering with all term pair similarity computations.
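The two-stage idea in the abstract (a cheap pass to find candidate groups, then costly pairwise comparisons only inside each group) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the event life cycle metric, so the sketch assumes a hypothetical stand-in that buckets terms by the day on which their mention count peaks, and it uses Jaccard overlap of context words as a placeholder for the more sophisticated similarity. The names `cheap_groups`, `expensive_similarity`, and `cluster`, and the `threshold` parameter, are illustrative only.

```python
from collections import defaultdict
from itertools import combinations


def cheap_groups(term_days):
    """Stage 1 (one pass over N terms): bucket terms by peak mention day.

    Assumption: peak day is a hypothetical proxy for an event life cycle.
    """
    groups = defaultdict(list)
    for term, counts in term_days.items():
        peak_day = max(range(len(counts)), key=counts.__getitem__)
        groups[peak_day].append(term)
    return list(groups.values())


def expensive_similarity(a, b, contexts):
    """Placeholder for a costly metric: Jaccard overlap of context words."""
    sa, sb = contexts[a], contexts[b]
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cluster(term_days, contexts, threshold=0.5):
    """Stage 2: pairwise comparisons only within each cheap group.

    Returns the clusters and the number of expensive comparisons made.
    """
    clusters = []
    comparisons = 0
    for group in cheap_groups(term_days):
        # Union-find over the terms of this group only.
        parent = {t: t for t in group}

        def find(t):
            while parent[t] != t:
                parent[t] = parent[parent[t]]  # path halving
                t = parent[t]
            return t

        for a, b in combinations(group, 2):
            comparisons += 1
            if expensive_similarity(a, b, contexts) >= threshold:
                parent[find(a)] = find(b)

        merged = defaultdict(set)
        for t in group:
            merged[find(t)].add(t)
        clusters.extend(merged.values())
    return clusters, comparisons
```

Under these assumptions, the expensive metric runs only on pairs that share a bucket, so the pairwise cost depends on the bucket size rather than on the full set of N candidate terms.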