In data mining applications characterized by highly skewed misclassification costs, certain types of errors become virtually unacceptable. This limits the utility of a classifier to the operating range in which such constraints can be met. Naive Bayes, which has proven very useful in text mining applications due to its high scalability, can be particularly affected: although its 0/1 loss tends to be small, its misclassifications are often made with apparently high confidence. Aside from efforts to better calibrate Naive Bayes scores, it has been shown that its accuracy depends on document sparsity, and that feature selection can lead to marked improvement in classification performance. Traditionally, sparsity is controlled globally, so the result for any particular document may vary. In this work we examine the merits of local sparsity control for Naive Bayes in the context of highly asymmetric misclassification costs. In experiments with three benchmark document collections, we demonstrate clear advantages of document-level feature selection. In the extreme cost setting, multinomial Naive Bayes with local sparsity control outperforms even some recently proposed effective improvements to the Naive Bayes classifier. There are also indications that local feature selection may be preferable in other cost settings.
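The abstract does not spell out the local selection criterion, but the core idea — training a standard multinomial Naive Bayes model and then, at prediction time, scoring each document using only its most discriminative terms — can be sketched as follows. Ranking a document's terms by the magnitude of their class-conditional log-likelihood ratio is an illustrative stand-in for whatever metric the paper actually uses; the function names, the toy data, and the cutoff `k` are all hypothetical.

```python
import math
from collections import Counter

def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log term probabilities per class."""
    classes = sorted(set(labels))
    vocab = sorted({w for d in docs for w in d})
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(d)
    log_prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    log_cond = {}
    for c in classes:
        total = sum(counts[c].values()) + alpha * len(vocab)
        log_cond[c] = {w: math.log((counts[c][w] + alpha) / total) for w in vocab}
    return classes, log_prior, log_cond

def predict_local(doc, classes, log_prior, log_cond, k=None):
    """Classify with document-level feature selection: keep only the k terms
    of this document whose class-conditional log-likelihood ratio has the
    largest magnitude (binary case); k=None means no local selection."""
    terms = [w for w in doc if all(w in log_cond[c] for c in classes)]
    if k is not None and len(classes) == 2:
        c0, c1 = classes
        terms.sort(key=lambda w: abs(log_cond[c0][w] - log_cond[c1][w]),
                   reverse=True)
        terms = terms[:k]
    scores = {c: log_prior[c] + sum(log_cond[c][w] for w in terms)
              for c in classes}
    return max(scores, key=scores.get)

# Toy usage: global model, per-document sparsity controlled by k.
docs = [["cheap", "pills", "buy"], ["buy", "cheap", "offer"],
        ["meeting", "notes", "agenda"], ["project", "meeting", "notes"]]
labels = ["spam", "spam", "ham", "ham"]
classes, lp, lc = train_multinomial_nb(docs, labels)
print(predict_local(["cheap", "buy", "offer", "meeting"], classes, lp, lc, k=3))
```

Note that the model itself is trained once, globally; only the evidence admitted per document varies, which is what distinguishes this from conventional global feature selection.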