Feature subset selection (FSS) is an important step in building effective text classification (TC) systems. This paper presents an empirical comparison of seventeen traditional FSS metrics for TC tasks. The study is restricted to the support vector machine (SVM) classifier and to Arabic articles. The evaluation used a corpus of 7842 documents independently classified into ten categories. The experimental results are reported in terms of macro-averaged precision, macro-averaged recall, and macro-averaged F1 measures. The results reveal that the Chi-square and Fallout FSS metrics work best for Arabic TC tasks.
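As a rough illustration of two quantities named in the abstract, the sketch below computes a Chi-square score for one term and one category from a 2x2 document-count table, and macro-averaged precision, recall, and F1 over a set of labels. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `chi_square` and `macro_scores` are hypothetical, the Chi-square form is the common contingency-table formulation, macro F1 is assumed to be the mean of per-category F1 values, and the toy inputs are invented for illustration.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square score of a term for one category (assumed 2x2 formulation).

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): treat as uninformative.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1).

    Each category counts equally, regardless of how many documents it has.
    Macro F1 here is the mean of the per-category F1 values (an assumption;
    some authors instead take the F1 of macro precision and macro recall).
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category gains a false positive
            fn[truth] += 1  # true category gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    k = len(labels)
    return sum(precisions) / k, sum(recalls) / k, sum(f1s) / k


# Toy usage with invented data (not taken from the paper's corpus).
score = chi_square(10, 0, 0, 10)
macro_p, macro_r, macro_f1 = macro_scores(
    ["a", "a", "b", "b"], ["a", "b", "b", "b"]
)
```

In a feature-selection setting, a score like this would typically be computed for every term and category, and the highest-scoring terms kept as features. How the per-category scores are combined into one ranking is a design choice that the abstract does not specify.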