A fast algorithm for feature selection in conditional maximum entropy modeling

  • Authors:
  • Yaqian Zhou; Lide Wu; Fuliang Weng; Hauke Schmidt

  • Affiliations:
  • Fudan University, Shanghai, P.R. China; Fudan University, Shanghai, P.R. China; Robert Bosch Corp., Palo Alto, CA; Robert Bosch Corp., Palo Alto, CA

  • Venue:
  • EMNLP '03: Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2003

Abstract

This paper describes a fast algorithm for selecting features in conditional maximum entropy modeling. Berger et al. (1996) present an incremental feature selection (IFS) algorithm that computes the approximate gains of all candidate features at each selection stage, which is very time-consuming for problems with large feature spaces. The new algorithm instead computes approximate gains only for the top-ranked features, based on the models obtained in previous stages. Experiments on WSJ data from the Penn Treebank show that the new algorithm greatly speeds up feature selection while maintaining the quality of the selected features. A variant of the algorithm with look-ahead functionality is also tested and further confirms the quality of the selected features. The new algorithm is easy to implement, and for a feature space of size F it uses only O(F) more space than the original IFS algorithm.
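
To make the selection strategy concrete, here is a minimal Python sketch of the selective-gain idea the abstract describes: keep candidates ranked by their last-computed approximate gains, and recompute a gain only when a feature reaches the top of the ranking. The `compute_approx_gain` callback and the list-based `model` placeholder are illustrative assumptions, not the paper's actual maximum entropy machinery.

```python
import heapq

def select_features(candidates, compute_approx_gain, num_to_select):
    """Lazily recompute approximate gains, only for features that
    rise to the top of the ranking (sketch of the selective idea).

    compute_approx_gain(feature, model) is a hypothetical callback
    returning the approximate gain of adding `feature` to `model`.
    """
    model = []   # features selected so far (stands in for the model)
    stage = 0    # number of selection stages completed

    # Max-heap keyed by negated gain; each entry records the stage at
    # which its gain was last computed. The index i breaks ties.
    heap = [(-compute_approx_gain(f, model), 0, i, f)
            for i, f in enumerate(candidates)]
    heapq.heapify(heap)

    selected = []
    while heap and len(selected) < num_to_select:
        neg_gain, computed_at, i, f = heapq.heappop(heap)
        if computed_at == stage:
            # Gain is current w.r.t. the model from this stage: select.
            selected.append((f, -neg_gain))
            model.append(f)
            stage += 1
        else:
            # Stale gain: recompute against the current model, reinsert.
            heapq.heappush(heap,
                           (-compute_approx_gain(f, model), stage, i, f))
    return selected
```

In the worst case every entry is recomputed at every stage, matching the cost of IFS; in practice only the few features near the top of the heap are touched per stage. The heap entries and stage markers account for the O(F) extra space claimed in the abstract.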