This paper proposes the use of non-extensive entropy for text classification. The non-extensive entropy technique classifies text by estimating the conditional distribution of the class variable given the document. The underlying principle, shared with maximum entropy, is that in the absence of external knowledge one should prefer distributions that are uniform. The paper proposes two models for text classification based on the maximum entropy principle: the first extends Shannon entropy to non-extensive entropy in order to simplify the form of the classifier, while the second introduces high-level constraints into the non-extensive model to impose constraints on pairs of entities. The model with high-level constraints builds relations between word pairs, which act as semantic constraints, so as to improve classification accuracy. Experiments on the 20 Newsgroups dataset demonstrate the advantage of the non-extensive model and of the non-extensive model with high-level constraints.
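A minimal sketch may help make the setup concrete. The snippet below is not the paper's implementation; it illustrates two ingredients the abstract refers to: the Tsallis (non-extensive) entropy, which generalizes Shannon entropy through a parameter q and is maximized by the uniform distribution, and a standard conditional maximum-entropy classifier p(y|x) ∝ exp(w_y · x), whose maximum-likelihood training is the dual of maximum entropy under feature-expectation constraints. The function names, the toy word-count data, and the choice of q are assumptions made purely for illustration.

```python
import numpy as np

def tsallis_entropy(p, q):
    """Non-extensive (Tsallis) entropy S_q(p) = (1 - sum_i p_i**q) / (q - 1).
    As q -> 1 this recovers the Shannon entropy -sum_i p_i * log(p_i)."""
    p = np.asarray(p, dtype=float)
    if abs(q - 1.0) < 1e-9:
        nz = p[p > 0]
        return float(-np.sum(nz * np.log(nz)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

def maxent_train(X, y, n_classes, lr=0.5, epochs=500):
    """Conditional maxent classifier p(y|x) proportional to exp(w_y . x),
    trained by gradient ascent on the conditional log-likelihood."""
    n, d = X.shape
    W = np.zeros((n_classes, d))
    Y = np.eye(n_classes)[y]                # one-hot labels, shape (n, n_classes)
    for _ in range(epochs):
        scores = X @ W.T                    # (n, n_classes)
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(scores)
        P /= P.sum(axis=1, keepdims=True)   # p(y|x) for every example
        # Gradient = observed feature counts minus model-expected counts.
        W += lr * ((Y - P).T @ X) / n
    return W

if __name__ == "__main__":
    # Uniform distributions maximize Tsallis entropy for any q > 0,
    # mirroring the "prefer uniform" principle stated in the abstract.
    print(tsallis_entropy([0.25] * 4, q=2.0))      # 0.75, the maximum over 4 outcomes
    print(tsallis_entropy([1.0, 0, 0, 0], q=2.0))  # 0.0, a fully peaked distribution

    # Tiny toy "documents": rows are word-count vectors, labels are classes.
    X = np.array([[3, 0, 1], [2, 0, 0], [0, 3, 1], [0, 2, 2]], dtype=float)
    y = np.array([0, 0, 1, 1])
    W = maxent_train(X, y, n_classes=2)
    print((X @ W.T).argmax(axis=1))                # expected: [0 0 1 1]
```

On real text the toy matrix would be replaced by term-frequency or TF-IDF features; the paper's non-extensive models presumably modify the exponential parametric form and add the word-pair constraints described above, neither of which this sketch attempts to reproduce.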