E-commerce and knowledge management applications generate and consume tremendous amounts of online information, most of it available as textual documents. To facilitate subsequent access to and use of these documents, efficient and effective management of the ever-increasing volume of documents is essential to both organizations and individuals. Document management practice shows that categories (e.g., folders) are a popular way to organize, archive, and access documents. Document clustering is an appealing approach that lets organizations or individuals create and maintain document categories automatically. Existing document clustering techniques usually group documents on the basis of their textual content similarity. However, such content-based approaches operate at the lexical level and suffer greatly from the word mismatch problem, in which documents on the same topic use different words. This study addresses the problem by exploiting users' document grouping preferences, as exhibited in their folder sets, to support document clustering. Specifically, we propose a hybrid document clustering technique that combines preference-based and content-based approaches. In an empirical evaluation that uses a traditional content-based technique and a preference/content switching technique as performance benchmarks, the proposed hybrid technique improves clustering effectiveness as measured by both cluster precision and cluster recall.
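To make the idea of combining the two signals concrete, the following is a minimal sketch and not the technique proposed in the paper, whose details the abstract does not give. It assumes a simple weighted sum of a content similarity (cosine over raw term frequencies) and a preference similarity (the share of users holding both documents who file them in the same folder). The function names, the `alpha` weight, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def content_similarity(tokens_a, tokens_b):
    """Cosine similarity between raw term-frequency vectors of two token lists."""
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def preference_similarity(id_a, id_b, folder_sets):
    """Share of users holding both documents who put them in the same folder.

    folder_sets: one entry per user, each a list of folders (sets of doc ids).
    This measure is an illustrative assumption, not the paper's definition.
    """
    holders = together = 0
    for folders in folder_sets:
        has_a = any(id_a in folder for folder in folders)
        has_b = any(id_b in folder for folder in folders)
        if has_a and has_b:
            holders += 1
            if any(id_a in folder and id_b in folder for folder in folders):
                together += 1
    return together / holders if holders else 0.0


def hybrid_similarity(id_a, id_b, docs, folder_sets, alpha=0.5):
    """Weighted sum of preference and content similarity (alpha is hypothetical)."""
    content = content_similarity(docs[id_a], docs[id_b])
    preference = preference_similarity(id_a, id_b, folder_sets)
    return alpha * preference + (1 - alpha) * content


# Toy data: d1 and d2 share no words (word mismatch) but users file them together.
docs = {
    "d1": ["car", "engine"],
    "d2": ["automobile", "motor"],
    "d3": ["car", "race"],
}
folder_sets = [
    [{"d1", "d2"}, {"d3"}],   # user 1: two folders
    [{"d1", "d2", "d3"}],     # user 2: one folder
]

for pair in [("d1", "d2"), ("d1", "d3")]:
    print(pair, round(hybrid_similarity(*pair, docs, folder_sets), 3))
```

In this toy data, the content score for `d1` and `d2` is zero because they share no terms, yet both users file them together, so the preference term alone lifts their combined similarity. A pairwise score of this kind could then be fed to any standard clustering procedure. How the paper actually weights or merges the two approaches is not stated in the abstract.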