Expert Systems with Applications: An International Journal
This paper proposes two modified evolutionary computing methods for genetic algorithms (GAs) and presents an effective content-based feature selection approach to improve clustering performance. Conventional GAs learn slowly and are prone to being trapped in a local minimum because the search space is high-dimensional. We propose a parametric and a nonparametric evolutionary algorithm to adjust the GA operators properly. In the parametric approach, several fuzzy control parameters are defined manually to adaptively optimize the behavior of the GA. In the nonparametric approach, by contrast, these parameters are adjusted automatically by the GA itself. Moreover, a content-based feature selection (CFS) approach is shown to create a robust semantic space and to reduce the number of dimensions, which speeds up evolutionary computing. We also take advantage of parallel computing to improve the efficiency of clustering. The experimental results show that our methods enhance the performance of the standard GA and that the parallel implementation is more efficient than one running on a single processor. The CFS approach not only reduces the document dimension but also indirectly improves clustering efficiency.
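The idea of letting the GA adjust its own control parameters can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the operators, so the sketch assumes a simple self-adaptive scheme in which each individual carries its own mutation rate, and that rate is inherited and perturbed along with the cluster labels. The function names `sse` and `evolve`, the sum-of-squared-error fitness, truncation selection, and all numeric bounds are illustrative assumptions.

```python
import random


def sse(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


def evolve(points, k, pop_size=30, generations=60, seed=0):
    """Self-adaptive GA sketch: returns (labels, mutation_rate) of the best individual."""
    rng = random.Random(seed)
    n = len(points)
    # Each individual is (cluster labels, its own mutation rate).
    pop = [([rng.randrange(k) for _ in range(n)], rng.uniform(0.01, 0.3))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        pop.sort(key=lambda ind: sse(points, ind[0], k))
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            (la, ra), (lb, rb) = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover on the label vector
            # The child's rate is the parents' mean, randomly perturbed and clamped,
            # so the rate itself is subject to selection (assumed scheme).
            rate = min(0.5, max(0.001, (ra + rb) / 2 * rng.uniform(0.8, 1.25)))
            labels = [rng.randrange(k) if rng.random() < rate else lab
                      for lab in la[:cut] + lb[cut:]]
            children.append((labels, rate))
        pop = parents + children
    return min(pop, key=lambda ind: sse(points, ind[0], k))


if __name__ == "__main__":
    pts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
    labels, rate = evolve(pts, k=2, seed=1)
    print(labels, round(rate, 3), round(sse(pts, labels, 2), 3))
```

Because the surviving parents are copied unchanged into the next generation, the best fitness never gets worse from one generation to the next. A parametric variant would instead set `rate` from an external rule (for example a fuzzy controller driven by population diversity), which this sketch does not attempt.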