Data sparseness, or overfitting, is a serious problem in natural language processing that employs machine learning methods. This remains true even for the maximum entropy (ME) method, whose flexible modeling capability has alleviated data sparseness more successfully than other probabilistic models in many NLP tasks. Although the ME method usually estimates the model so that it completely satisfies the equality constraints on feature expectations, complete satisfaction leads to undesirable overfitting, especially for sparse features, since constraints derived from a limited amount of training data are always uncertain. To control overfitting in ME estimation, we propose the use of box-type inequality constraints, in which equality can be violated up to certain predefined levels that reflect this uncertainty. The derived models, inequality ME models, in effect undergo regularized estimation with L1-norm penalties and bounded parameters. Most importantly, this regularized estimation allows the model parameters to become sparse. This can be thought of as automatic feature selection, which is expected to further improve generalization performance. We evaluate the inequality ME models on text categorization datasets and demonstrate their advantages over standard ME estimation, the similarly motivated Gaussian MAP estimation of ME models, and support vector machines (SVMs), a state-of-the-art method for text categorization.
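The sparsity effect described above can be illustrated with a toy sketch. This is an assumed, simplified stand-in, not the paper's estimation algorithm: a binary conditional maximum entropy model (logistic regression) trained with an L1 penalty via proximal gradient descent. The soft-thresholding step is the proximal operator of the L1 norm; it drives the weights of uninformative features to exactly zero, which is the "automatic feature selection" behavior the abstract attributes to the inequality ME models. The data, the penalty strength `c`, and the learning rate `lr` are all made up for the demonstration.

```python
# Illustrative sketch (assumed, not the paper's algorithm): L1-penalized
# binary maxent (logistic regression) trained by proximal gradient
# descent.  Soft thresholding drives weights of uninformative features
# to exactly zero, i.e. automatic feature selection.
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_l1_maxent(data, dim, c=0.1, lr=0.1, epochs=500):
    """data: list of (feature_vector, label) pairs, label in {0, 1}."""
    w = [0.0] * dim
    for _ in range(epochs):
        # Mean gradient of the negative log-likelihood.
        grad = [0.0] * dim
        for x, y in data:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            for j in range(dim):
                grad[j] += (p - y) * x[j] / len(data)
        # Gradient step, then soft thresholding (proximal operator of
        # the L1 norm): weights whose average gradient stays below the
        # penalty c settle at exactly zero.
        for j in range(dim):
            v = w[j] - lr * grad[j]
            w[j] = math.copysign(max(abs(v) - lr * c, 0.0), v)
    return w

random.seed(0)
# Toy data: only the first 2 of 10 features carry signal.
data = []
for _ in range(200):
    x = [random.gauss(0.0, 1.0) for _ in range(10)]
    data.append((x, 1 if x[0] + x[1] > 0 else 0))

w = train_l1_maxent(data, dim=10)
print("nonzero features:", [j for j, wj in enumerate(w) if wj != 0.0])
```

With the L1 penalty, most of the eight noise features end up with weights of exactly zero, whereas a Gaussian (L2) penalty, like the Gaussian MAP estimation mentioned above, would only shrink them toward zero without eliminating them.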