We introduce a novel document clustering approach that overcomes those problems by combining a semantic-based bipartite graph representation with a mutual refinement strategy. The primary contributions of this paper are the following. First, we introduce a new representation of documents as a bipartite graph between the documents and the co-occurrence concepts they contain. Second, we show how to enhance clustering quality by applying the mutual refinement strategy to the initial clustering results. Third, through experiments on MEDLINE documents, we show that our integrated method significantly improves cluster quality and clustering reliability compared to existing clustering methods. On average, our approach improves cluster quality by 29.5 and clustering reliability by 26.3, in terms of misclassification index, over Bisecting K-means with the best parameters.
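The two ideas named in the abstract, a document-concept bipartite graph and a mutual refinement of an initial clustering, can be illustrated with a small sketch. The abstract does not specify the algorithm, so everything here is an assumption made for illustration: the function names `build_bipartite` and `mutual_refine`, the use of raw concept counts as edge weights, and the weighted-majority voting rule (concepts take the cluster of the documents they occur in, then documents take the cluster of their concepts, repeated until labels stop changing). It is not the paper's method.

```python
from collections import Counter, defaultdict


def build_bipartite(docs):
    """Build both sides of a document-concept bipartite graph.

    `docs` maps a document id to the list of concepts found in it.
    Edge weights are raw occurrence counts (an assumption of this sketch).
    """
    doc_to_concepts = {d: Counter(concepts) for d, concepts in docs.items()}
    concept_to_docs = defaultdict(Counter)
    for d, counts in doc_to_concepts.items():
        for c, w in counts.items():
            concept_to_docs[c][d] = w
    return doc_to_concepts, dict(concept_to_docs)


def _winner(votes):
    # Highest vote total wins; ties go to the smallest label for determinism.
    return max(sorted(votes), key=votes.get)


def mutual_refine(doc_to_concepts, concept_to_docs, doc_labels, max_iter=10):
    """Hypothetical refinement: alternate concept and document relabeling."""
    labels = dict(doc_labels)
    for _ in range(max_iter):
        # Concept step: each concept joins the cluster holding most of its weight.
        concept_labels = {}
        for c, doc_weights in concept_to_docs.items():
            votes = Counter()
            for d, w in doc_weights.items():
                votes[labels[d]] += w
            concept_labels[c] = _winner(votes)

        # Document step: each document joins the cluster of most of its concepts.
        new_labels = {}
        for d, concept_weights in doc_to_concepts.items():
            votes = Counter()
            for c, w in concept_weights.items():
                votes[concept_labels[c]] += w
            new_labels[d] = _winner(votes) if votes else labels[d]

        if new_labels == labels:  # converged: no document changed cluster
            break
        labels = new_labels
    return labels


# Toy data (invented for illustration): d3 starts in the wrong cluster.
docs = {
    "d1": ["gene", "protein", "gene"],
    "d2": ["gene", "protein"],
    "d3": ["protein", "gene", "tumor"],
    "d4": ["tumor", "therapy"],
    "d5": ["tumor", "therapy", "therapy"],
}
initial = {"d1": 0, "d2": 0, "d3": 1, "d4": 1, "d5": 1}
d2c, c2d = build_bipartite(docs)
refined = mutual_refine(d2c, c2d, initial)
print(refined)
```

In this toy run the initial labels could come from any clustering method; the refinement only uses the graph to move documents whose concepts mostly belong to another cluster.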