Naive Bayes is often used in text classification applications and experiments because of its simplicity and effectiveness. However, its performance is often degraded for three reasons: it does not model text well, feature selection is often inappropriate, and it lacks reliable confidence scores. We address these problems and show that simple corrections solve them. We demonstrate that these modifications significantly improve the performance of Naive Bayes for text classification.
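For orientation, the baseline that the abstract refers to can be sketched as a standard multinomial Naive Bayes classifier with add-one smoothing. This is a minimal illustration, not the corrected method proposed in the paper: the function names (`train_multinomial_nb`, `posterior`, `classify`), the smoothing parameter `alpha`, and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with add-alpha smoothing.

    docs is a list of token lists; labels is a parallel list of class names.
    (Hypothetical helper written for this illustration.)
    """
    vocab = {w for doc in docs for w in doc}
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    model = {}
    for c, n_docs in class_docs.items():
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        model[c] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_lik": {
                w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
            },
        }
    return model


def posterior(model, doc):
    """Return normalized class posteriors for a tokenized document."""
    scores = {}
    for c, params in model.items():
        # Tokens outside the training vocabulary are ignored.
        scores[c] = params["log_prior"] + sum(
            params["log_lik"][w] for w in doc if w in params["log_lik"]
        )
    # Normalize in log space (log-sum-exp) to avoid underflow.
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {c: math.exp(s - log_z) for c, s in scores.items()}


def classify(model, doc):
    """Return the class with the highest posterior."""
    post = posterior(model, doc)
    return max(post, key=post.get)


# Toy data, invented for this example.
train_docs = [
    ["cheap", "pills", "cheap"],
    ["meeting", "agenda"],
    ["cheap", "offer"],
    ["project", "meeting", "notes"],
]
train_labels = ["spam", "ham", "spam", "ham"]

model = train_multinomial_nb(train_docs, train_labels)
print(classify(model, ["cheap", "pills"]))
print(posterior(model, ["cheap"]))
print(posterior(model, ["cheap"] * 10))
```

In this sketch, repeating the same token ten times pushes the posterior very close to 1, because every occurrence is treated as independent evidence. This is one commonly cited reason why raw Naive Bayes posteriors are poor confidence scores on text.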