Weakly supervised techniques for domain-independent sentiment classification

Authors:
Jonathon Read; John Carroll
Affiliations:
University of Sussex, Falmer, Brighton, United Kingdom (both authors)
Venue:
Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion
Year:
2009

Abstract

An important sub-task of sentiment analysis is polarity classification, in which text is classified as positive or negative. Supervised machine learning techniques can perform this task very effectively. However, they require a large corpus of training data, and a number of studies have demonstrated that the good performance of supervised models depends on a close match between the training and testing data with respect to domain, topic and time period. Weakly supervised techniques use a large collection of unlabelled text to determine sentiment, and so their performance may be less dependent on the domain, topic and time period represented by the testing data. This paper presents experiments that investigate the effectiveness of word similarity techniques when performing weakly supervised sentiment classification. It also considers the extent to which the performance of each method is independent of the domain, topic and time period of the testing data. The results indicate that the word similarity techniques are suitable for applications that require sentiment classification across several domains.
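
To make the idea of word-similarity-based, weakly supervised classification concrete, the following is a minimal sketch rather than the authors' implementation. It assumes one common similarity measure, pointwise mutual information (PMI) between a word and a handful of seed words, estimated from document-level co-occurrence in unlabelled text. The seed lists, the smoothing constant, the whitespace tokenisation and all function names are illustrative assumptions that do not come from the abstract.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical seed words (an assumption for illustration only).
POSITIVE_SEEDS = {"good", "excellent"}
NEGATIVE_SEEDS = {"bad", "poor"}


def cooccurrence_counts(documents):
    """Count, per word and per word pair, the number of documents containing them."""
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = set(doc.lower().split())  # naive whitespace tokenisation
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_counts, pair_counts


def pmi(w1, w2, word_counts, pair_counts, n_docs, smoothing=0.01):
    """Smoothed PMI of two distinct words, from document co-occurrence."""
    joint = (pair_counts[frozenset((w1, w2))] + smoothing) / n_docs
    p1 = (word_counts[w1] + smoothing) / n_docs
    p2 = (word_counts[w2] + smoothing) / n_docs
    return math.log2(joint / (p1 * p2))


def semantic_orientation(word, word_counts, pair_counts, n_docs):
    """Association with positive seeds minus association with negative seeds."""
    pos = sum(pmi(word, s, word_counts, pair_counts, n_docs)
              for s in POSITIVE_SEEDS if s != word)
    neg = sum(pmi(word, s, word_counts, pair_counts, n_docs)
              for s in NEGATIVE_SEEDS if s != word)
    return pos - neg


def classify(text, unlabelled_docs):
    """Label text by summing the orientation of its words seen in the corpus."""
    word_counts, pair_counts = cooccurrence_counts(unlabelled_docs)
    n_docs = len(unlabelled_docs)
    score = sum(semantic_orientation(w, word_counts, pair_counts, n_docs)
                for w in text.lower().split() if w in word_counts)
    return "positive" if score >= 0 else "negative"


# Toy unlabelled corpus: no document carries a sentiment label.
corpus = [
    "good excellent superb film",
    "superb acting good plot",
    "bad poor dreadful film",
    "dreadful script poor pacing",
]
print(classify("superb acting", corpus))
print(classify("dreadful script", corpus))
```

Because the only supervision is the short seed list, the same procedure could in principle be rerun on unlabelled text from a new domain without annotating any training documents, which is the property the abstract highlights.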