In this paper, we address the problem of classifying a large collection of text and propose a method that combines two well-known machine learning techniques, Naive Bayes (NB) and Support Vector Machines (SVMs). NB assumes that the words in a text are independent of one another, which makes its computation far more efficient. SVMs, on the other hand, can handle large feature spaces, which gives them the potential to achieve better performance. The training data for the SVMs are extracted by NB classifiers according to the category hierarchy, which reduces the amount of computation needed for classification without sacrificing accuracy.
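
The abstract does not spell out how NB selects the SVM training data, so the following is only a minimal sketch of one possible reading: an NB classifier with Laplace smoothing picks the top-level category of a document, and only the training documents under that category are kept for a later SVM. The function names (`train_nb`, `nb_log_scores`, `extract_svm_training_data`), the two-level hierarchy, and the smoothing choice are all assumptions made for illustration, not details from the paper. The SVM itself is left out.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled_docs):
    """Train a multinomial NB model from (tokens, label) pairs."""
    label_counts = Counter(label for _, label in labeled_docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled_docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def nb_log_scores(tokens, model):
    """Return log P(label) + sum of log P(word | label) for each label.

    Summing per-word terms is exactly the word-independence assumption.
    """
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing: an assumption of this sketch
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return scores


def extract_svm_training_data(tokens, model, hierarchy_docs):
    """Hypothetical extraction step.

    hierarchy_docs holds (tokens, top_category, leaf_category) triples.
    NB chooses the top-level category; only documents under it are
    returned, as (tokens, leaf_category) pairs for a later SVM.
    """
    scores = nb_log_scores(tokens, model)
    best_top = max(scores, key=scores.get)
    subset = [(t, leaf) for t, top, leaf in hierarchy_docs if top == best_top]
    return best_top, subset
```

Under this reading, the SVM is trained on a fraction of the collection rather than on all of it, which is where the reduction in computation would come from.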