Language Feature Mining for Document Subjectivity Analysis
ISDPE '07 Proceedings of the The First International Symposium on Data, Privacy, and E-Commerce
Document subjectivity analysis has become an important aspect of web text content mining. The problem resembles traditional text categorization, so many existing classification techniques can be adapted to it. One significant difference, however, is that richer language and semantic information is required to estimate the subjectivity of a document accurately. This paper therefore focuses on two aspects: how to extract useful and meaningful language features, and how to construct appropriate language models efficiently for this task. For the first, we apply a Global-Filtering and Local-Weighting strategy to select and weight language features drawn from n-grams of different orders and within various distance windows. For the second, we adopt Maximum Entropy (MaxEnt) modeling to construct our language-model framework. In addition to the classical MaxEnt model, we construct two improved models with Gaussian and exponential priors, respectively. Detailed experiments show that, with well-selected and well-weighted language features, MaxEnt models with exponential priors are significantly better suited to the text subjectivity analysis task.
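The two priors named in the abstract correspond to familiar regularizers on the MaxEnt weights: a Gaussian prior is equivalent to an L2 penalty that shrinks all weights smoothly, while an exponential prior adds a linear penalty on nonnegative weights, which can drive weak features exactly to zero and yield a sparser model. The following is a minimal sketch of that distinction for a binary MaxEnt (logistic) classifier, not the authors' implementation; the function name, hyperparameters, and training loop are illustrative assumptions:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_maxent(X, y, prior="gaussian", alpha=0.1, lr=0.5, epochs=200):
    """Binary MaxEnt model trained by gradient descent with a weight prior.

    prior="gaussian":    penalty (alpha/2) * w_j**2   -> smooth L2 shrinkage
    prior="exponential": penalty  alpha * w_j, w_j >= 0
                         -> linear pressure plus projection to w_j >= 0,
                            zeroing out weakly supported features
    """
    n_feats = len(X[0])
    w = [0.0] * n_feats
    for _ in range(epochs):
        # Gradient of the negative log-likelihood.
        grad = [0.0] * n_feats
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j in range(n_feats):
                grad[j] += (p - yi) * xi[j]
        for j in range(n_feats):
            if prior == "gaussian":
                w[j] -= lr * (grad[j] / len(X) + alpha * w[j])
            else:
                # Exponential prior: constant-slope penalty, then project
                # back onto the feasible region w_j >= 0.
                w[j] -= lr * (grad[j] / len(X) + alpha)
                w[j] = max(w[j], 0.0)
    return w

# Toy data: feature 0 predicts the label, feature 1 is noise.
X = [[1, 0], [1, 1], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
w_exp = train_maxent(X, y, prior="exponential")
w_gauss = train_maxent(X, y, prior="gaussian")
```

On this toy set the exponential prior leaves the noise feature's weight at exactly zero, while the Gaussian prior merely shrinks it, which mirrors the sparsity argument made for the exponential-prior models in the paper.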