In this paper, we discuss the application of a recent hierarchic clustering algorithm to the automatic classification of files of documents. Whereas most hierarchic clustering algorithms involve the generation and updating of an inter-object dissimilarity matrix, this new algorithm is based upon a series of nearest neighbor searches. Such an approach is appropriate to several clustering methods, including Ward's method which has been shown to perform well in experimental studies of hierarchic document clustering. A description is given of heuristics which can increase the efficiency of the new algorithm when it is used to cluster three document collections by Ward's method.
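
The idea of clustering by repeated nearest neighbor searches, rather than by maintaining a full dissimilarity matrix, can be illustrated with a minimal sketch. The abstract does not name the algorithm, so this assumes a reciprocal-nearest-neighbor chain scheme: follow nearest neighbors from cluster to cluster until two clusters are each other's nearest neighbor, then merge them. Ward's criterion is computed from cluster sizes and centroids. The function names (`ward_cost`, `nn_chain_ward`), the dense list-of-floats document representation, and the brute-force neighbor search are all illustrative assumptions; the efficiency heuristics mentioned above are not reproduced here.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def ward_cost(size_a: int, cen_a: Vector, size_b: int, cen_b: Vector) -> float:
    """Increase in within-cluster sum of squares if the two clusters merge."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(cen_a, cen_b))
    return size_a * size_b / (size_a + size_b) * sq_dist


def nn_chain_ward(points: List[Vector]) -> List[Tuple[int, int, float]]:
    """Return the merges as (cluster_id_a, cluster_id_b, ward_cost).

    Input points get ids 0..n-1; each merged cluster gets the next free id.
    Merges are emitted in the order they are found, which is not
    necessarily ascending order of cost.
    """
    # Each cluster is stored as (size, centroid); no dissimilarity matrix is kept.
    clusters: Dict[int, Tuple[int, Vector]] = {
        i: (1, list(p)) for i, p in enumerate(points)
    }
    next_id = len(points)
    merges: List[Tuple[int, int, float]] = []
    chain: List[int] = []

    while len(clusters) > 1:
        if not chain:
            chain.append(min(clusters))  # arbitrary starting cluster
        while True:
            tip = chain[-1]
            prev: Optional[int] = chain[-2] if len(chain) > 1 else None
            size_t, cen_t = clusters[tip]
            # Brute-force nearest neighbor search from the tip of the chain.
            best, best_cost = None, float("inf")
            for cid, (size_c, cen_c) in clusters.items():
                if cid == tip:
                    continue
                cost = ward_cost(size_t, cen_t, size_c, cen_c)
                # On ties, prefer the previous chain element so the chain stops.
                if cost < best_cost or (cost == best_cost and cid == prev):
                    best, best_cost = cid, cost
            if best == prev:
                break  # tip and prev are reciprocal nearest neighbors
            chain.append(best)

        # Merge the reciprocal pair; the rest of the chain stays usable.
        b = chain.pop()
        a = chain.pop()
        size_a, cen_a = clusters.pop(a)
        size_b, cen_b = clusters.pop(b)
        n = size_a + size_b
        centroid = [(size_a * x + size_b * y) / n for x, y in zip(cen_a, cen_b)]
        clusters[next_id] = (n, centroid)
        merges.append((a, b, best_cost))
        next_id += 1

    return merges
```

On a toy input such as `nn_chain_ward([[0.0], [1.0], [10.0], [11.0]])`, the two close pairs are merged first and the resulting clusters are joined last. Keeping the unmerged part of the chain between merges relies on Ward's criterion never bringing a merged cluster closer to a third cluster than both of its parts were.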