While naive Bayes is quite effective in various data mining tasks, it performs disappointingly on automatic text classification. By examining how naive Bayes behaves on natural language text, we identify a serious problem in its parameter estimation process, which causes the poor results in the text classification domain. In this paper, we propose two empirical heuristics: per-document text normalization and a feature weighting method. Although these methods are somewhat ad hoc, the resulting naive Bayes text classifier performs very well on standard benchmark collections, competing with state-of-the-art text classifiers based on highly complex learning methods such as SVM.
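
To make the two heuristics more concrete, the following is a minimal Python sketch of a multinomial naive Bayes classifier in which each document contributes length-normalized term frequencies to parameter estimation, and in which an optional per-term weight scales each term's log-likelihood contribution. The abstract does not give the formulas, so everything here is an assumption for illustration: the function names `normalize`, `train`, and `classify`, the normalization by total token count, the Laplace smoothing constant `alpha`, and the `weights` dictionary are all hypothetical and should not be read as the method proposed in the paper.

```python
import math
from collections import Counter, defaultdict


def normalize(tokens):
    """Convert raw term counts into per-document relative frequencies.

    Assumed normalization: divide each count by the document length, so a
    long document does not dominate the estimates simply by being long.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def train(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log term probabilities per class."""
    vocab = sorted({term for doc in docs for term in doc})
    class_sizes = Counter(labels)

    # Accumulate normalized term mass per class; each document adds 1.0 total.
    mass = defaultdict(lambda: defaultdict(float))
    for tokens, label in zip(docs, labels):
        for term, freq in normalize(tokens).items():
            mass[label][term] += freq

    log_prior = {c: math.log(n / len(docs)) for c, n in class_sizes.items()}
    log_like = {}
    for c in class_sizes:
        # Laplace smoothing (an assumption) keeps unseen terms from zeroing out.
        denom = sum(mass[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((mass[c][t] + alpha) / denom) for t in vocab}
    return log_prior, log_like


def classify(tokens, log_prior, log_like, weights=None):
    """Return the highest-scoring class for a tokenized document.

    `weights` maps a term to a hypothetical feature weight (default 1.0)
    that scales the term's contribution to the class score.
    """
    weights = weights or {}
    freqs = normalize(tokens)
    scores = {}
    for c, prior in log_prior.items():
        score = prior
        for term, freq in freqs.items():
            if term in log_like[c]:  # terms outside the vocabulary are skipped
                score += weights.get(term, 1.0) * freq * log_like[c][term]
        scores[c] = score
    return max(scores, key=scores.get)


# Toy usage with made-up documents of different lengths.
docs = [
    ["ball", "goal", "goal"],
    ["vote", "election"],
    ["goal", "team", "team", "team", "ball", "ball"],
    ["vote", "party", "party"],
]
labels = ["sport", "politics", "sport", "politics"]
log_prior, log_like = train(docs, labels)
prediction = classify(["goal", "team"], log_prior, log_like)
```

In this sketch the six-token sports document and the three-token one carry equal total mass during training, which is the effect per-document normalization is meant to achieve. How the feature weights themselves would be chosen is left open here, since the abstract does not specify it.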