Combining Naive Bayes and n-Gram Language Models for Text Classification

  • Authors:
  • Fuchun Peng; Dale Schuurmans

  • Affiliations:
  • School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada (both authors)

  • Venue:
  • ECIR '03: Proceedings of the 25th European Conference on IR Research
  • Year:
  • 2003

Abstract

We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers. The chain augmented naive Bayes classifiers we propose have two advantages over standard naive Bayes classifiers. First, a chain augmented naive Bayes model relaxes some of the independence assumptions of naive Bayes, allowing a local Markov chain dependence among the observed variables, while still permitting efficient inference and learning. Second, smoothing techniques from statistical language modeling can be used to recover better estimates than the Laplace smoothing typically used in naive Bayes classification. Our experimental results on three real-world data sets show that we achieve substantial improvements over standard naive Bayes classification, while also achieving state-of-the-art performance that competes with the best known methods on these tasks.
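As a rough illustration of the idea described in the abstract (not the authors' implementation), the sketch below trains one bigram language model per class and classifies a document by its smoothed class-conditional sequence likelihood plus the class prior, which is exactly a local Markov chain dependence among the observed words. The class name `ChainAugmentedNB` is hypothetical, and the simple add-delta smoothing is a stand-in for the stronger language-model smoothing methods the paper evaluates.

```python
import math
from collections import Counter, defaultdict

class ChainAugmentedNB:
    """Illustrative sketch: per-class bigram language models combined
    with a class prior (a Markov chain of order n = 2 over words)."""

    def __init__(self, delta=0.5):
        self.delta = delta      # add-delta smoothing parameter (assumed)
        self.bigram = {}        # class -> bigram counts
        self.context = {}       # class -> history (w1) counts
        self.log_prior = {}     # class -> log P(c)
        self.vocab = set()

    def fit(self, docs, labels):
        # Estimate class priors and per-class bigram counts.
        for c, n in Counter(labels).items():
            self.log_prior[c] = math.log(n / len(labels))
            self.bigram[c] = defaultdict(int)
            self.context[c] = defaultdict(int)
        for words, c in zip(docs, labels):
            padded = ["<s>"] + list(words)   # start-of-document marker
            self.vocab.update(padded)
            for w1, w2 in zip(padded, padded[1:]):
                self.bigram[c][(w1, w2)] += 1
                self.context[c][w1] += 1

    def _log_p(self, c, w1, w2):
        # Add-delta smoothed bigram probability P(w2 | w1, c);
        # the paper studies more sophisticated LM smoothing instead.
        v = len(self.vocab)
        return math.log((self.bigram[c][(w1, w2)] + self.delta) /
                        (self.context[c][w1] + self.delta * v))

    def predict(self, words):
        # MAP decision: argmax_c log P(c) + sum log P(w_i | w_{i-1}, c).
        padded = ["<s>"] + list(words)
        def score(c):
            return self.log_prior[c] + sum(
                self._log_p(c, w1, w2)
                for w1, w2 in zip(padded, padded[1:]))
        return max(self.log_prior, key=score)
```

A toy usage example, with made-up labels and tokens:

```python
clf = ChainAugmentedNB()
clf.fit([["great", "plot"], ["dull", "plot"]], ["pos", "neg"])
print(clf.predict(["great", "movie"]))   # -> "pos" on this toy data
```

Note that setting the model up as per-class word-sequence likelihoods keeps inference linear in document length, which is why the relaxed independence assumption still permits efficient inference and learning.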