Prevalent document management practices rely heavily on categories (e.g., folders) to organize documents for subsequent search and retrieval. The coherence and distinctiveness of existing document categories can diminish considerably as new documents arrive over time. The complexity and effort involved in managing document categories favor an automated approach supported by appropriate document-clustering techniques. A review of the extant literature shows that automated document-category management has focused predominantly on document content analysis, which cannot preserve a user's document-grouping preferences. This research develops two evolution-based techniques that preserve user preferences in the management of document categories. The first technique, CE2, supports the automated evolution of a set of flat (i.e., nonhierarchical) document categories; it extends a promising evolution-based technique, category evolution (CE), by addressing the fundamental limitations inherent in CE's use of holistic measures. The second technique, category hierarchy evolution (CHE), builds on CE2 to support scenarios in which document categories are organized hierarchically. We empirically evaluate the effectiveness of each technique in various category evolution scenarios created from two document corpora (news documents from Reuters and research articles from the ACM digital library) and compare each technique with salient benchmark techniques. The results show that CE2 and CHE outperform their respective benchmarks. Their performance is reasonably robust, and both techniques appear more effective when the quality (coherence) of the previously created categories has not deteriorated excessively. According to our results, the evolution-based approach is viable, appealing, and capable of preserving user preferences in automatic reorganizations of document categories.