This paper proposes using the Silhouette Coefficient (SC) as a ranking measure for instance selection in text classification. Our selection criterion keeps instances with mid-range SC values and removes instances with high or low SC values. We evaluated this hypothesis on three well-known datasets with various machine learning algorithms. The results show that our method helps to achieve the best trade-off between classification accuracy and training time.
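The selection criterion can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions that the abstract does not specify: class labels are treated as the clusters when computing SC, Euclidean distance is used over dense feature vectors, and the cut-offs `low` and `high` are hypothetical parameters, as are the function names `silhouette_per_instance` and `select_mid_range`.

```python
import math
from collections import defaultdict


def silhouette_per_instance(points, labels):
    """Silhouette coefficient of each point, using class labels as clusters (assumption)."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in by_label[label] if j != i]
        if not own or len(by_label) < 2:
            # Usual convention: a singleton cluster (or a single class) scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other instances of the same class.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the instances of any other class.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in by_label.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def select_mid_range(points, labels, low=0.0, high=0.9):
    """Indices of instances whose SC lies in [low, high]; both cut-offs are hypothetical."""
    scores = silhouette_per_instance(points, labels)
    return [i for i, s in enumerate(scores) if low <= s <= high]


if __name__ == "__main__":
    # Toy one-dimensional data: the last point sits closer to class "A" than to its own class.
    points = [(0.0,), (1.0,), (4.0,), (10.0,), (11.0,), (5.0,)]
    labels = ["A", "A", "A", "B", "B", "B"]
    print(silhouette_per_instance(points, labels))
    print(select_mid_range(points, labels, low=0.0, high=0.7))
```

Under these assumptions, instances with high SC sit deep inside their class and add little new information, while instances with low SC lie closer to another class and may be noisy or mislabeled. The filter keeps the remaining mid-range instances as the reduced training set. Suitable cut-offs would have to be tuned for each dataset.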