Using Ontology in Hierarchical Information Clustering

Authors:
Travis D. Breaux;Joel W. Reed
Affiliations:
North Carolina State University;Oak Ridge National Laboratory
Venue:
HICSS '05 Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS'05) - Track 4 - Volume 04
Year:
2005

Citing: 0
Cited by: 4

Exploiting noun phrases and semantic relationships for text document clustering
Information Sciences: an International Journal

A Term-Based Driven Clustering Approach for Name Disambiguation
APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management

A latent image semantic indexing scheme for image retrieval on the web
WISE'06 Proceedings of the 7th international conference on Web Information Systems

Clustering OWL documents based on semantic analysis
WAIM'05 Proceedings of the 6th international conference on Advances in Web-Age Information Management

Abstract

The tools to analyze and visualize information from multiple, heterogeneous sources have often relied on innovations in statistical methods. The results from purely statistical methods, however, overlook relevant semantic features present within natural language and text-based information. Emerging research in ontology languages (e.g., RDF, RDFS, SUO-KIF, and OWL) offers promising avenues for overcoming these limitations by leveraging existing and future libraries of meta-data and semantic mark-up. Using semantic features (e.g., hypernyms, meronyms, and synonyms) encoded in ontology languages, methods such as keyword search and clustering can be augmented to analyze and visualize documents at conceptually richer levels. We present findings from a hierarchical clustering system modified for ontological indexing and run on a topic-centric test collection of documents, each with fewer than 200 words. Our findings show that ontologies can impose a complete interpretation or subjective clustering onto a document set that is at least as good as meta-word search.
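
The idea of augmenting clustering with hypernyms can be illustrated with a minimal sketch. This is not the authors' system: the toy `HYPERNYMS` map, the helper names, the Jaccard similarity, and the single-link merge rule are all assumptions chosen for brevity. Each short document is indexed by its terms plus their hypernym ancestors, and the resulting concept sets are clustered bottom-up.

```python
from itertools import combinations

# Hypothetical toy ontology: each term maps to its direct hypernym (parent concept).
HYPERNYMS = {
    "poodle": "dog",
    "beagle": "dog",
    "dog": "animal",
    "sparrow": "bird",
    "bird": "animal",
    "sedan": "car",
    "car": "vehicle",
    "truck": "vehicle",
}


def concept_index(text):
    """Return the document's terms plus all of their hypernym ancestors."""
    concepts = set()
    for term in text.lower().split():
        # Walk up the hypernym chain; stop at the root or at a concept
        # whose ancestors were already added by an earlier term.
        while term is not None and term not in concepts:
            concepts.add(term)
            term = HYPERNYMS.get(term)
    return concepts


def jaccard(a, b):
    """Set overlap in [0, 1]; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def agglomerate(indexes):
    """Single-link agglomerative clustering; returns the merge history."""
    clusters = [frozenset([i]) for i in range(len(indexes))]
    merges = []
    while len(clusters) > 1:
        def link(pair):
            # Single link: similarity of the closest pair of members.
            return max(jaccard(indexes[i], indexes[j])
                       for i in pair[0] for j in pair[1])

        left, right = max(combinations(clusters, 2), key=link)
        merges.append((sorted(left), sorted(right), link((left, right))))
        clusters = [c for c in clusters if c not in (left, right)] + [left | right]
    return merges


docs = ["poodle barks", "beagle runs", "sedan parks", "truck stops"]
merges = agglomerate([concept_index(d) for d in docs])
for left, right, similarity in merges:
    print(left, right, round(similarity, 3))
```

In this toy collection no two documents share a literal word, so a purely term-based index gives every pair a similarity of zero. With hypernym expansion, the two dog documents share "dog" and "animal" and the two vehicle documents share "vehicle", which is what lets the hierarchy group them before the final merge.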