Using latent semantic analysis to improve access to textual information
CHI '88 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Automatic text processing: the transformation, analysis, and retrieval of information by computer
Automatic text processing: the transformation, analysis, and retrieval of information by computer
Self organization of a massive document collection
IEEE Transactions on Neural Networks
Hi-index | 0.00 |
Different from familiar clustering objects, text documents have sparse data spaces. A common way of representing a document is as a bag of its component words, but the semantic relations between words are ignored. In this paper, we propose a novel document representation approach to strengthen the discriminative feature of document objects. We replace terms of documents with concepts in WordNet and construct a model named Concept CHain Model(CCHM) for document representation. CCHM is applied in both partitioning and agglomerative clustering analysis. Hierarchical clustering processes in different levels of concept chains. The experimental evaluation on textual data sets demonstrates the validity and efficiency of CCHM. The results of experiments with concept show the superiority of our approach in hierarchical clustering.