Fluency rankers are used in modern sentence generation systems to pick sentences that are not just grammatical, but also fluent. Feature-based models, such as maximum entropy models, have been shown to work well for this task. Since maximum entropy models allow for the incorporation of arbitrary real-valued features, it is often attractive to create very general feature templates that give rise to a huge number of features. Feature selection can then be applied to retain only the most discriminative features. In this paper we compare three feature selection methods: frequency-based selection, a generalization of maximum entropy feature selection to ranking tasks with real-valued features, and a new selection method based on feature value correlation. We show that the widely used frequency-based selection performs poorly compared to maximum entropy feature selection, and that models with a few hundred well-chosen features are competitive with models to which no feature selection has been applied. In the experiments described in this paper, we compressed a model of approximately 490,000 features to 1,000 features.
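To make the contrast between these selection criteria concrete, the sketch below (a minimal illustration, not the authors' implementation) ranks the features of a toy event matrix in two ways: by how often a feature fires, and by the absolute Pearson correlation between feature values and fluency scores. Treating "feature value correlation" as Pearson correlation is an assumption, and the function names and toy data are invented for this example; the gain-driven maximum entropy selection is omitted here, since it greedily estimates the improvement in model log-likelihood for each candidate feature and is considerably more involved.

```python
import numpy as np

def frequency_select(X, k):
    """Rank features by how often they fire (have a non-zero value)
    across training events and keep the k most frequent ones."""
    freq = np.count_nonzero(X, axis=0)
    return np.argsort(-freq)[:k]

def correlation_select(X, y, k):
    """Rank features by the absolute Pearson correlation between
    their values and the fluency score of each event, keeping the
    k most strongly correlated ones."""
    Xc = X - X.mean(axis=0)          # center feature columns
    yc = y - y.mean()                # center fluency scores
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    safe = np.where(denom == 0, 1.0, denom)  # guard constant features
    corr = (Xc.T @ yc) / safe
    return np.argsort(-np.abs(corr))[:k]

# Toy example: 6 candidate realisations, 4 real-valued features,
# and a fluency score per realisation.
rng = np.random.default_rng(0)
X = rng.random((6, 4))
y = rng.random(6)
print(frequency_select(X, 2))       # indices of the 2 most frequent features
print(correlation_select(X, y, 2))  # indices of the 2 most correlated features
```

The frequency criterion ignores the fluency scores entirely, which is one intuition for why it can pick many features that are common but not discriminative; the correlation criterion at least ties each feature to the quantity being ranked.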