An analytical evaluation of six widely used term weighting techniques for text categorization is presented. The analysis expresses the term weights as functions of the term occurrence probabilities in the positive and negative categories. The weighting behavior of each scheme is first clarified by analyzing the relation between the occurrence probabilities of terms that receive equal weights. The weights are then expressed in terms of the ratio and the difference of the term occurrence probabilities, which reveals the similarities and differences among the schemes. Simulations show that the relative performance of the schemes can be explained by how they use the ratio and difference of term occurrence probabilities in generating the term weights.
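As a rough illustration of the idea, the sketch below computes one ratio-based and one difference-based weight from term occurrence probabilities. The abstract does not name the six schemes, so `ratio_weight` and `difference_weight` are hypothetical stand-ins chosen for illustration, not the schemes evaluated in the paper; the function names, the `eps` floor, and the document counts are likewise assumptions.

```python
import math


def occurrence_probabilities(pos_with, pos_total, neg_with, neg_total, eps=1e-6):
    """Estimate a term's occurrence probability in each category.

    pos_with / neg_with: number of positive / negative documents containing the term.
    pos_total / neg_total: number of documents in each category.
    eps is an assumed floor that avoids division by zero and log(0).
    """
    p_pos = max(pos_with / pos_total, eps)
    p_neg = max(neg_with / neg_total, eps)
    return p_pos, p_neg


def ratio_weight(p_pos, p_neg):
    # Illustrative ratio-based weight: depends only on p_pos / p_neg.
    return math.log2(p_pos / p_neg)


def difference_weight(p_pos, p_neg):
    # Illustrative difference-based weight: depends only on p_pos - p_neg.
    return p_pos - p_neg


# Two hypothetical terms with the same probability ratio (4:1)
# but very different absolute occurrence probabilities.
frequent = occurrence_probabilities(40, 100, 10, 100)   # (0.4, 0.1)
rare = occurrence_probabilities(4, 100, 1, 100)         # (0.04, 0.01)

for name, (p_pos, p_neg) in [("frequent", frequent), ("rare", rare)]:
    print(name, ratio_weight(p_pos, p_neg), difference_weight(p_pos, p_neg))
```

Under these assumptions, the two terms receive the same ratio-based weight but different difference-based weights, which is the kind of equal-weight relation the analysis examines.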