Modelling bias is an important consideration when dealing with inexpert annotations. We are concerned with training a classifier to perform sentiment analysis on news media articles, some of which have been manually annotated by volunteers. The classifier is trained on the words in the articles and then applied to non-annotated articles. In previous work we found that jointly estimating the annotator biases and the classifier parameters performed better than estimating the biases first and then training the classifier. An important question follows from this result: can the annotators be usefully grouped, into either predetermined or data-driven clusters, based on their biases? If so, such a clustering could be used to select, drop or otherwise categorise the annotators in a crowdsourcing task. This paper presents work on fitting a finite mixture model to the annotators' biases. We develop a model and an algorithm and demonstrate their properties on simulated data. We then demonstrate the clustering that exists in our motivating dataset, namely the analysis of potentially economically relevant news articles from Irish online news sources.
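The paper's own model and algorithm are not reproduced here. As a minimal sketch of the general idea only, the following assumes each annotator's bias has been summarised as a single scalar and fits a finite (here, Gaussian) mixture to those scalars with a hand-rolled EM loop; the component assignments then give the data-driven annotator clusters. All function and variable names are illustrative, not from the paper.

```python
import numpy as np

def em_mixture_1d(x, k, n_iter=200):
    """Fit a k-component 1-D Gaussian mixture to bias estimates x via EM.

    Returns mixture weights pi, component means mu, and variances var.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Deterministic initialisation: spread the means over the data quantiles,
    # start with a shared variance and uniform mixture weights.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, np.var(x) + 1e-9)
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(component j | x_i),
        # computed in log space for numerical stability.
        log_p = (-0.5 * (x[:, None] - mu) ** 2 / var
                 - 0.5 * np.log(2.0 * np.pi * var)
                 + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9
    return pi, mu, var

# Simulated data in the spirit of the paper's simulation study:
# two annotator groups with clearly different bias levels.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-0.5, 0.1, 100), rng.normal(0.4, 0.1, 100)])
pi, mu, var = em_mixture_1d(x, k=2)
clusters = np.argmin(np.abs(x[:, None] - mu), axis=1)  # hard assignment
```

The paper works with a block/Bernoulli-style mixture over annotation behaviour rather than a plain Gaussian mixture over scalars; the EM structure (alternating responsibilities and parameter updates) carries over, with the component likelihood swapped accordingly.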