This paper proposes a local feature selection (FS) measure, namely Categorical Descriptor Term (CTD), for text categorization. It is derived from the classic term weighting scheme, TFIDF. The method explicitly chooses a feature set for each category by selecting terms only from the relevant category. Although previous studies have suggested that using features from irrelevant categories can improve text categorization performance, we believe that incorporating only relevant features can be highly effective. An experimental comparison is carried out between CTD and five well-known feature selection measures: Information Gain, Chi-Square, Correlation Coefficient, Odds Ratio and GSS Coefficient. The results show that our proposed method performs comparably to the other FS measures, especially on collections with highly overlapping topics.
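As a rough illustration of local feature selection, the sketch below picks a separate feature set for each category using only terms that occur in that category's own documents. The abstract does not give the CTD formula, so the score used here (category term frequency times inverse document frequency over the whole collection) is an assumed TFIDF-style stand-in, not the paper's measure. The function name `select_local_features`, the parameter `k`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def select_local_features(docs, labels, k):
    """Return up to k terms per category, drawn only from that category's documents.

    The score is an assumed TFIDF-style weight, NOT the CTD measure itself.
    """
    n_docs = len(docs)

    # Document frequency over the whole collection, used for the IDF factor.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    # Term frequency accumulated per category (relevant documents only).
    tf_by_cat = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        tf_by_cat[label].update(tokens)

    selected = {}
    for cat, tf in tf_by_cat.items():
        scores = {t: f * math.log(n_docs / df[t]) for t, f in tf.items()}
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected[cat] = ranked[:k]
    return selected


# Toy collection (hypothetical data): two categories with one shared term.
docs = [
    ["wheat", "grain", "export", "price"],
    ["wheat", "harvest", "grain", "wheat"],
    ["oil", "crude", "price", "barrel"],
    ["oil", "opec", "crude", "oil"],
]
labels = ["grain", "grain", "crude", "crude"]

features = select_local_features(docs, labels, k=2)
print(features)
```

Because candidates for a category come only from its own documents, a term such as "oil" can never enter the "grain" feature set. A global measure, by contrast, ranks all terms against all categories at once.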