A maximum entropy approach to natural language processing
Computational Linguistics
Maximum entropy models for natural language ambiguity resolution
A maximum entropy approach to identifying sentence boundaries
ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Sequential conditional Generalized Iterative Scaling
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Chunking with maximum entropy models
ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Efficient sampling and feature selection in whole sentence maximum entropy language models
ICASSP '99 Proceedings of the Acoustics, Speech, and Signal Processing, 1999. on 1999 IEEE International Conference - Volume 01
Discriminative Reranking for Natural Language Parsing
Computational Linguistics
A progressive feature selection algorithm for ultra large feature spaces
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Discriminative features in reversible stochastic attribute-value grammars
UCNLG+EVAL '11 Proceedings of the UCNLG+Eval: Language Generation and Evaluation Workshop
This paper describes a fast algorithm for selecting features in conditional maximum entropy modeling. Berger et al. (1996) present an incremental feature selection (IFS) algorithm that computes the approximate gains of all candidate features at each selection stage, which is prohibitively time-consuming for problems with large feature spaces. The new algorithm instead computes approximate gains only for the top-ranked features, based on the models obtained in previous stages. Experiments on the Wall Street Journal (WSJ) data in the Penn Treebank show that the new algorithm greatly speeds up feature selection while maintaining the quality of the selected features. A variant with look-ahead functionality is also tested and further confirms the quality of the selected features. The new algorithm is easy to implement, and for a feature space of size F it uses only O(F) more space than the original IFS algorithm.
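The selection strategy the abstract describes — keeping gains computed at earlier stages and re-evaluating only the top-ranked candidates against the current model — can be sketched as a lazy greedy loop. This is an illustrative sketch only, not the paper's algorithm: the coverage gain below is a toy stand-in for the log-likelihood gain of adding a feature to a maximum entropy model, and the names `select_features`, `cover_gain`, and `universe` are hypothetical.

```python
import heapq

# Toy stand-in for a maxent gain function: each "feature" covers a set of
# elements, and its gain is the number of elements not yet covered by the
# already-selected features. (Hypothetical data for the demo.)
universe = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}

def cover_gain(feature, selected):
    covered = set().union(*(universe[s] for s in selected)) if selected else set()
    return len(universe[feature] - covered)

def select_features(candidates, gain, k):
    """Greedy feature selection with lazy gain re-evaluation.

    IFS recomputes gain(f, model) for every candidate f at every stage.
    Here, gains from earlier stages are kept in a max-heap and only the
    current top-ranked feature is re-evaluated; a feature is selected once
    its freshly computed gain is still the largest.
    """
    selected = []
    # Heap entries: (-gain, feature, stage at which the gain was computed).
    heap = [(-gain(f, selected), f, 0) for f in candidates]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        neg_g, f, stage = heapq.heappop(heap)
        if stage == len(selected):      # gain is up to date: select it
            selected.append(f)
        else:                           # stale gain: recompute and push back
            heapq.heappush(heap, (-gain(f, selected), f, len(selected)))
    return selected

print(select_features(sorted(universe), cover_gain, 2))  # → ['c', 'a']
```

In the worst case every stale gain is recomputed, but when gains shrink as the model grows (as the abstract's experiments suggest), most candidates are never re-evaluated, and the only extra storage is one cached gain per candidate, i.e. O(F) space for a feature space of size F.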