State-of-the-art text clustering methods suffer from the huge size of documents and their high-dimensional features. In this paper, we study fast SOM clustering technology for text information. Our focus is on improving the efficiency of a text clustering system while maintaining high clustering quality. To achieve this goal, we separate the system into two stages: offline and online. To make the system more efficient, feature extraction and semantic quantization are done offline. Neurons are represented as numerical vectors in a high-dimensional space, but documents are represented as collections of a few important keywords, which differs from many related works; this reduces the time and space requirements of the offline stage. Based on this design, we propose fast clustering techniques for the online stage, including a method for projecting documents onto the SOM output layer, a fast similarity computation method, and an incremental clustering scheme for real-time processing. We tested the system on different datasets; the results show that our approach is much more efficient than traditional methods while achieving comparable clustering quality.
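The online stage described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes cosine similarity as the matching criterion, documents stored as a hypothetical `{keyword_index: weight}` dictionary, neurons stored as dense lists with precomputed norms, and a simplified update that moves only the winning neuron (no neighborhood function). All function names are illustrative.

```python
import math


def cosine_to_neurons(weights, norms, doc):
    """Cosine similarity between one sparse document and every dense neuron.

    weights: list of dense neuron vectors (one list of floats per neuron)
    norms:   precomputed L2 norm of each neuron vector
    doc:     dict mapping keyword index -> keyword weight (assumed format)
    Only the document's keyword indices are visited, so the cost per neuron
    depends on the number of keywords, not on the vocabulary size.
    """
    doc_norm = math.sqrt(sum(v * v for v in doc.values()))
    sims = []
    for w, n in zip(weights, norms):
        dot = sum(w[k] * v for k, v in doc.items())
        denom = n * doc_norm
        sims.append(dot / denom if denom > 0 else 0.0)
    return sims


def best_matching_unit(weights, norms, doc):
    """Project a document onto the output layer: index of the most similar neuron."""
    sims = cosine_to_neurons(weights, norms, doc)
    return max(range(len(sims)), key=sims.__getitem__)


def online_update(weights, norms, doc, bmu, lr=0.1):
    """Incremental step: w <- (1 - lr) * w + lr * x for the winning neuron only.

    Simplifying assumption: no neighborhood update, fixed learning rate.
    """
    w = weights[bmu]
    for i in range(len(w)):
        w[i] *= 1.0 - lr
    for k, v in doc.items():
        w[k] += lr * v
    norms[bmu] = math.sqrt(sum(x * x for x in w))


if __name__ == "__main__":
    # Three neurons over a four-keyword vocabulary (toy values).
    weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    norms = [math.sqrt(sum(x * x for x in w)) for w in weights]
    doc = {1: 2.0, 2: 1.0}  # document containing keywords 1 and 2
    bmu = best_matching_unit(weights, norms, doc)
    online_update(weights, norms, doc, bmu, lr=0.5)
    print(bmu, weights[bmu])
```

Keeping the neuron norms cached means each similarity needs only a short dot product over the document's keywords, which is one plausible reading of how a keyword-based document representation could speed up matching.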