Incremental hierarchical text document clustering algorithms are important for organizing documents generated by streaming online sources such as newswire and blogs. However, this area is relatively unexplored in the text document clustering literature. Popular incremental hierarchical clustering algorithms, namely Cobweb and Classit, have not been widely used with text document data. We discuss why, in their current form, these algorithms are not suitable for text clustering and propose an alternative formulation that changes the algorithm's underlying distributional assumption so that it conforms to the data. Both the original Classit algorithm and our proposed algorithm are evaluated using Reuters newswire articles and the Ohsumed dataset.
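
For orientation, the following is a minimal sketch of the category utility criterion commonly attributed to the original Classit algorithm, which assumes each attribute is normally distributed within a cluster. It is not the alternative formulation proposed here, whose details the abstract does not give. The function names, the list-of-vectors input format, and the default `acuity` value are illustrative assumptions.

```python
import math
from statistics import pstdev


def attribute_stdevs(docs, acuity):
    """Per-attribute population standard deviation, floored at `acuity`.

    `docs` is assumed to be a non-empty list of equal-length numeric vectors.
    The floor keeps 1/sigma finite for clusters with no spread.
    """
    return [max(pstdev(column), acuity) for column in zip(*docs)]


def classit_category_utility(children, acuity=1.0):
    """Category utility of a partition under a normal attribute assumption.

    `children` is a list of clusters; each cluster is a non-empty list of
    vectors. The parent cluster is taken to be the union of the children.
    """
    parent = [doc for child in children for doc in child]
    n = len(parent)
    parent_score = sum(1.0 / s for s in attribute_stdevs(parent, acuity))

    total = 0.0
    for child in children:
        child_score = sum(1.0 / s for s in attribute_stdevs(child, acuity))
        # Weight each child's gain in "tightness" by its share of the parent.
        total += (len(child) / n) * (child_score - parent_score)

    # Normalise by the number of children and the Gaussian constant 2*sqrt(pi).
    return total / (len(children) * 2.0 * math.sqrt(math.pi))


# Hypothetical one-attribute data: a clean split and an uninformative split.
good_split = classit_category_utility([[[0.0], [0.0]], [[10.0], [10.0]]])
bad_split = classit_category_utility([[[0.0], [10.0]], [[0.0], [10.0]]])
```

In this sketch a partition scores higher when its clusters are tighter than their parent, so `good_split` is positive while `bad_split` is zero. The reliance on standard deviations is the normal-distribution assumption that the abstract says does not fit text data.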