This paper investigates the effect of prior feature selection in Support Vector Machine (SVM) text categorization. The input space was gradually enlarged using mutual information (MI) filtering and part-of-speech (POS) filtering, which determine the portion of words appropriate for learning from the information-theoretic and the linguistic perspectives, respectively. We tested the two filtering methods on SVMs as well as on the decision tree algorithm C4.5. For SVMs, two results were common to both filtering methods: 1) the optimal number of features differed completely across categories, and 2) the average performance over all categories was best when all of the words were used. In addition, a comparison of the two filtering methods showed that POS filtering on SVMs consistently outperformed MI filtering, which indicates that SVMs cannot find irrelevant parts of speech. These results suggest a simple strategy for SVM text categorization: use the full set of words that survive a rough filtering technique such as part-of-speech tagging.
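As a rough illustration of what MI filtering can look like, the sketch below scores each word by the mutual information between its presence in a document and a binary category label, then keeps the top-ranked words. The abstract does not give the exact MI formula or the feature-set sizes used, so the formula (the standard expected mutual information over the 2x2 term/category contingency table), the function names `mutual_information` and `top_k_terms`, and the toy data are all assumptions made for illustration only.

```python
import math
from collections import Counter


def mutual_information(docs, labels):
    """Score terms by mutual information (in bits) between term presence
    and a binary category label. Assumed formula, not taken from the paper."""
    n = len(docs)
    n_pos = sum(labels)
    df = Counter()      # number of documents containing the term
    df_pos = Counter()  # number of in-category documents containing the term
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df[term] += 1
            if y:
                df_pos[term] += 1

    scores = {}
    for term, n_t in df.items():
        n11 = df_pos[term]       # term present, in category
        n10 = n_t - n11          # term present, not in category
        n01 = n_pos - n11        # term absent, in category
        n00 = n - n_t - n01      # term absent, not in category
        mi = 0.0
        # Each tuple: joint count, term marginal count, category marginal count.
        for n_tc, n_t_marg, n_c_marg in (
            (n11, n_t, n_pos),
            (n10, n_t, n - n_pos),
            (n01, n - n_t, n_pos),
            (n00, n - n_t, n - n_pos),
        ):
            if n_tc > 0:  # empty cells contribute nothing (0 * log 0 := 0)
                mi += (n_tc / n) * math.log2(n_tc * n / (n_t_marg * n_c_marg))
        scores[term] = mi
    return scores


def top_k_terms(scores, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy corpus (hypothetical): two in-category and two out-of-category documents.
docs = [
    ["stock", "market", "the"],
    ["stock", "price", "the"],
    ["game", "team", "the"],
    ["game", "score", "the"],
]
labels = [1, 1, 0, 0]
scores = mutual_information(docs, labels)
selected = top_k_terms(scores, 2)
```

Raising `k` step by step and retraining a classifier on each selected vocabulary would mimic the gradual enlargement of the input space described above. In this toy corpus, a word such as "the" that occurs in every document receives a score of zero and is the first to be filtered out.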