We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers. The resulting chain augmented naive Bayes classifiers have two advantages over standard naive Bayes classifiers. First, they relax some of the independence assumptions of naive Bayes by allowing a local Markov chain dependence among the observed variables, while still permitting efficient inference and learning. Second, they can use smoothing techniques from statistical language modeling, which recover better estimates than the Laplace smoothing usually used in naive Bayes classification. Experimental results on three real-world data sets show substantial improvements over standard naive Bayes classification and state-of-the-art performance that competes with the best known methods in these cases.
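
As a rough illustration of the idea, the sketch below trains one bigram language model per class and scores a document by adding the log class prior to the sum of log bigram probabilities, so each token depends on the class and on the preceding token. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `ChainAugmentedNB`, the parameters `lam` and `alpha`, the `<s>` start symbol, and the toy data are all hypothetical, and the linear interpolation with an add-alpha unigram model is only a stand-in for the language-modeling smoothing methods the abstract refers to.

```python
import math
from collections import Counter


class ChainAugmentedNB:
    """Per-class bigram language models combined with class priors (sketch)."""

    def __init__(self, lam=0.7, alpha=1.0):
        self.lam = lam      # assumed weight on the bigram estimate
        self.alpha = alpha  # assumed add-alpha constant for the unigram model

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.n_docs = len(labels)
        self.vocab = set()
        self.unigrams = {c: Counter() for c in self.classes}
        self.bigrams = {c: Counter() for c in self.classes}
        self.context = {c: Counter() for c in self.classes}
        for tokens, c in zip(docs, labels):
            seq = ["<s>"] + list(tokens)  # hypothetical start-of-document symbol
            self.vocab.update(tokens)
            self.unigrams[c].update(tokens)
            for prev, cur in zip(seq, seq[1:]):
                self.bigrams[c][(prev, cur)] += 1
                self.context[c][prev] += 1
        return self

    def log_score(self, tokens, c):
        """log P(c) + sum_i log P(w_i | w_{i-1}, c), with interpolation."""
        vocab_size = len(self.vocab) + 1  # one extra slot for unseen tokens
        total = sum(self.unigrams[c].values())
        score = math.log(self.prior[c] / self.n_docs)
        seq = ["<s>"] + list(tokens)
        for prev, cur in zip(seq, seq[1:]):
            # Smoothed unigram estimate: always positive, so the log is defined.
            p_uni = (self.unigrams[c][cur] + self.alpha) / (
                total + self.alpha * vocab_size
            )
            # Maximum-likelihood bigram estimate; zero if the context is unseen.
            ctx = self.context[c][prev]
            p_bi = self.bigrams[c][(prev, cur)] / ctx if ctx else 0.0
            score += math.log(self.lam * p_bi + (1 - self.lam) * p_uni)
        return score

    def predict(self, tokens):
        return max(self.classes, key=lambda c: self.log_score(tokens, c))


# Toy usage with made-up data.
train_docs = [
    ["cheap", "pills", "online"],
    ["buy", "cheap", "pills"],
    ["meeting", "agenda", "attached"],
    ["see", "agenda", "for", "meeting"],
]
train_labels = ["spam", "spam", "ham", "ham"]
model = ChainAugmentedNB().fit(train_docs, train_labels)
print(model.predict(["cheap", "pills"]))
print(model.predict(["meeting", "agenda"]))
```

Setting `lam` to 0 reduces the scorer to a standard multinomial naive Bayes classifier with add-alpha (Laplace-style) smoothing, which makes the two changes described in the abstract easy to compare on the same data.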