C4.5: programs for machine learning
The nature of statistical learning theory
Little words can make a big difference for text classification
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Machine Learning
Training algorithms for linear text classifiers
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Feature selection, perceptron learning, and a usability case study for text categorization
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Inductive learning algorithms and representations for text categorization
Proceedings of the seventh international conference on Information and knowledge management
Fast training of support vector machines using sequential minimal optimization
Advances in kernel methods
A re-examination of text categorization methods
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Communications of the ACM
Automatic Indexing: An Experimental Inquiry
Journal of the ACM (JACM)
Text filtering by boosting naive Bayes classifiers
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
Learning to Classify Text Using Support Vector Machines: Methods, Theory and Algorithms
A Study of Approaches to Hypertext Categorization
Journal of Intelligent Information Systems
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Automatic Text Categorization: Case Study
SBRN '02 Proceedings of the VII Brazilian Symposium on Neural Networks (SBRN'02)
An extensive empirical study of feature selection metrics for text classification
The Journal of Machine Learning Research
Using the feature projection technique based on a normalized voting method for text classification
Information Processing and Management: an International Journal
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Journal of the American Society for Information Science and Technology
Competitive generative models with structure learning for NLP classification tasks
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Weighted proportional k-interval discretization for naive-Bayes classifiers
PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining
Techniques for improving the performance of naive bayes for text classification
CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing
Text classification presents challenges due to the large number of features, the dependencies among them, and the large number of training documents. In this research, we investigate whether words are appropriate features for classifying documents by the ethnic group of their authors and/or by the historical period in which they were written. To the best of our knowledge, these kinds of classification have not been explored before. In addition, we investigate Forman's (2003) claim that common words should not be used for classification tasks. The application domain consisted of articles on Jewish law written in Hebrew-Aramaic, which have been little studied. Different experiments using SVM and InfoGain yield highly successful results (more than 95%). The results indicate that using common words as features contributes to making the learning task more efficient and more accurate.
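As a rough illustration of the InfoGain scoring mentioned above, the following sketch computes the information gain of a word's presence or absence with respect to the class label. It is not the authors' implementation: the function names, the toy documents, and the labels are all hypothetical, and the SVM stage is omitted. It only shows how a common word can receive a high score when its usage differs between classes.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Information gain of the binary feature 'term occurs in document'.

    docs   -- list of sets of tokens (hypothetical toy representation)
    labels -- list of class labels, one per document
    """
    with_term, without_term = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        if term in tokens:
            with_term[label] += 1
        else:
            without_term[label] += 1

    n = len(docs)
    n_with = sum(with_term.values())
    n_without = n - n_with

    # H(class) minus the expected entropy after splitting on the term.
    h_class = entropy(list(Counter(labels).values()))
    h_conditional = (
        (n_with / n) * entropy(list(with_term.values()))
        + (n_without / n) * entropy(list(without_term.values()))
    )
    return h_class - h_conditional


def rank_terms(docs, labels):
    """Return (term, score) pairs sorted by descending information gain."""
    vocabulary = set().union(*docs)
    scores = {t: information_gain(docs, labels, t) for t in vocabulary}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up toy data, not drawn from the corpus described above.
docs = [
    {"the", "court", "ruled"},
    {"the", "judge", "ruled"},
    {"a", "poem", "sings"},
    {"a", "song", "sings"},
]
labels = ["law", "law", "verse", "verse"]

ranking = rank_terms(docs, labels)
```

In this toy data the common words "the" and "a" separate the two classes perfectly, so they score as high as the content words "ruled" and "sings", while "court" (present in only one document) scores much lower. A feature selector that discards common words up front would never see that signal.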