Most existing text classification methods represent documents with the vector space model and weight the document vectors with the TF-IDF method. However, TF-IDF weighting ignores the fact that the weight of a feature in a document depends not only on the document but also on the class that the document belongs to. In this paper, we present a Clustering-based feature Weighting approach for text Classification, or CWC for short. CWC treats each class in the training collection as a known cluster and iteratively searches for feature weights that optimize the clustering objective function, so that documents in different classes are best distinguished under the resulting feature weights. We validate CWC by running classification on two real text collections, and the experimental results show that CWC outperforms the traditional KNN.
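To make the motivation concrete, the following is a minimal sketch, not the CWC algorithm itself. The abstract does not state the clustering objective or the iterative update, so the sketch substitutes a simple closed-form stand-in: each feature is weighted by the share of its total scatter that lies between classes, with every class treated as a known cluster. The function names, the toy corpus, and the scatter-ratio weighting are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (plain TF-IDF)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: (c / len(doc)) * math.log(n / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]


def class_aware_weights(vectors, labels):
    """Hypothetical stand-in for learned feature weights (not the CWC update):
    between-class scatter divided by total scatter, one value per feature,
    treating each class as a known cluster."""
    vocab = sorted({t for v in vectors for t in v})
    by_class = defaultdict(list)
    for vec, label in zip(vectors, labels):
        by_class[label].append(vec)
    weights = {}
    for t in vocab:
        overall = sum(v.get(t, 0.0) for v in vectors) / len(vectors)
        between = within = 0.0
        for members in by_class.values():
            centroid = sum(v.get(t, 0.0) for v in members) / len(members)
            between += len(members) * (centroid - overall) ** 2
            within += sum((v.get(t, 0.0) - centroid) ** 2 for v in members)
        total = between + within
        weights[t] = between / total if total > 0 else 0.0
    return weights


def weighted_distance(a, b, weights):
    """Squared Euclidean distance with one weight per feature."""
    terms = set(a) | set(b)
    return sum(
        weights.get(t, 0.0) * (a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in terms
    )


# Toy corpus (made up): "goal" separates the classes, "report" does not.
docs = [
    ["goal", "match", "team", "report"],
    ["goal", "team", "coach", "news"],
    ["stock", "market", "fund", "report"],
    ["stock", "fund", "bank", "news"],
]
labels = ["sport", "sport", "finance", "finance"]

vectors = tf_idf(docs)
weights = class_aware_weights(vectors, labels)
```

In this toy corpus, TF-IDF assigns "goal" and "report" the same weight in the first document because both occur in two of the four documents, even though only "goal" tracks the class. The class-aware weights separate the two terms, and a KNN classifier could use `weighted_distance` in place of an unweighted distance.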