Assembling the optimal sentiment classifiers

Authors:
Yuming Lin;Xiaoling Wang;Jingwei Zhang;Aoying Zhou
Affiliations:
Institute of Massive Computing, East China Normal University, Shanghai, China;Institute of Massive Computing, East China Normal University, Shanghai, China;Institute of Massive Computing, East China Normal University, Shanghai, China;Institute of Massive Computing, East China Normal University, Shanghai, China
Venue:
WISE'12 Proceedings of the 13th international conference on Web Information Systems Engineering
Year:
2012

Abstract

Sentiment classification aims to classify documents according to their overall sentiment orientation and plays an important role in many web applications, such as electronic commerce. Machine learning is an effective approach to this task. In general, for a given training set, a classifier is determined by a feature type, a weighting function, and a classification algorithm. Users must therefore decide in advance which of these to apply, which is difficult because the same classifier performs differently across domains. To address this problem, we develop a three-phase framework based on assembling multiple classifiers. To choose the optimal combination of classifiers, we propose a criterion that estimates the quality of a combination from the sentiment classification accuracy of its classifiers and the diversity of the results they generate. Moreover, we experimentally study the effect of the number of classifiers selected. With our solution, users can achieve good performance without having to choose among the many possible combinations of classifiers. Extensive experiments demonstrate the effectiveness of our solution across different domains.
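
The idea of scoring a combination of classifiers by both accuracy and diversity can be illustrated with a small sketch. The abstract does not give the actual criterion or the three phases, so everything below is an assumption made for illustration: the score (mean accuracy plus a weighted mean pairwise disagreement), the `weight` value, the exhaustive search over combinations, the majority vote, and the classifier names and toy predictions are all hypothetical, not the authors' method.

```python
from itertools import combinations


def accuracy(preds, labels):
    """Fraction of predictions that match the gold labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def disagreement(a, b):
    """Fraction of documents on which two classifiers predict differently."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def combination_score(pred_sets, labels, weight=0.3):
    """Hypothetical quality criterion: mean accuracy + weight * mean pairwise disagreement."""
    mean_acc = sum(accuracy(p, labels) for p in pred_sets) / len(pred_sets)
    pairs = list(combinations(pred_sets, 2))
    mean_div = sum(disagreement(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return mean_acc + weight * mean_div


def select_combination(candidates, labels, k, weight=0.3):
    """Exhaustively pick the k classifiers whose combination scores highest."""
    best = max(
        combinations(sorted(candidates), k),
        key=lambda names: combination_score([candidates[n] for n in names], labels, weight),
    )
    return list(best)


def majority_vote(pred_sets):
    """Combine binary (0/1) predictions by simple majority."""
    return [1 if 2 * sum(votes) > len(votes) else 0 for votes in zip(*pred_sets)]


# Toy validation data (made up): 1 = positive, 0 = negative.
labels = [1, 0, 1, 1, 0, 0]
# Each hypothetical classifier stands for one (feature type, weighting, algorithm) choice.
candidates = {
    "unigram_nb": [1, 0, 1, 0, 0, 0],
    "bigram_svm": [1, 0, 0, 1, 0, 0],
    "unigram_svm": [1, 0, 1, 0, 0, 1],
    "pos_maxent": [0, 0, 1, 1, 0, 0],
}

selected = select_combination(candidates, labels, k=3)
ensemble = majority_vote([candidates[name] for name in selected])
print(selected, accuracy(ensemble, labels))
```

On this toy data, each selected classifier makes one error on a different document, so the majority vote recovers all six labels. This is the intuition behind rewarding diversity alongside accuracy; a real system would estimate both on held-out data and would likely need a cheaper search than trying every combination.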