A statistical learning learning model of text classification for support vector machines
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
A support vector method for multivariate performance measures
ICML '05 Proceedings of the 22nd international conference on Machine learning
Naive Bayes models for probability estimation
ICML '05 Proceedings of the 22nd international conference on Machine learning
Efficient Text Classification by Weighted Proximal SVM
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Introduction to Information Retrieval
Introduction to Information Retrieval
Transferring naive bayes classifiers for text classification
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Hi-index | 0.00 |
This paper compares statistical techniques for text classification using Naïve Bayes and Support Vector Machines, in context of Urdu language. A large corpus is used for training and testing purpose of the classifiers. However, those classifiers cannot directly interpret the raw dataset, so language specific preprocessing techniques are applied on it to generate a standardized and reduced-feature lexicon. Urdu language is morphological rich language which makes those tasks complex. Statistical characteristics of corpus and lexicon are measured which show satisfactory results of text preprocessing module. The empirical results show that Support Vector Machines outperform Naïve Bayes classifier in terms of classification accuracy.