Towards multimodal sentiment analysis: harvesting opinions from the web

Authors:
Louis-Philippe Morency;Rada Mihalcea;Payal Doshi
Affiliations:
Institute for Creative Technologies / University of Southern California, Los Angeles, CA, USA;University of North Texas, Denton, TX, USA;University of Southern California, Los Angeles, CA, USA
Venue:
ICMI '11 Proceedings of the 13th international conference on multimodal interfaces
Year:
2011

Citing 15
Cited 5

Mining and summarizing customer reviews

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Towards answering opinion questions: separating facts from opinions and identifying the polarity of opinion sentences

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Emotion Recognition Based on Joint Visual and Audio Cues

ICPR '06 Proceedings of the 18th International Conference on Pattern Recognition - Volume 01
A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Word sense and subjectivity

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Emotions from text: machine learning for text-based emotion prediction

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions

IEEE Transactions on Pattern Analysis and Machine Intelligence
Topic segmentation of dialogue

ACTS '09 Proceedings of the HLT-NAACL 2006 Workshop on Analyzing Conversations in Text and Speech
Why are they excited?: identifying and explaining spikes in blog mood levels

EACL '06 Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations
Multimodal subjectivity analysis of multiparty conversation

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
SemEval-2007 task 14: affective text

SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
Markov models for offline handwriting recognition: a survey

International Journal on Document Analysis and Recognition
Lexicon-based methods for sentiment analysis

Computational Linguistics
Creating subjective and objective sentence classifiers from unannotated texts

CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing
Interrelation Between Speech and Facial Gestures in Emotional Utterances: A Single Subject Study

IEEE Transactions on Audio, Speech, and Language Processing

Towards sensing the influence of visual narratives on human affect

Proceedings of the 14th ACM international conference on Multimodal interaction
Towards multimodal deception detection -- step 1: building a collection of deceptive videos

Proceedings of the 14th ACM international conference on Multimodal interaction
Negative sentiment in scenarios elicit pupil dilation response: an auditory study

Proceedings of the 14th ACM international conference on Multimodal interaction
A crowdsourcing platform for the construction of accessibility maps

Proceedings of the 10th International Cross-Disciplinary Conference on Web Accessibility
Inferring mood in ubiquitous conversational video

Proceedings of the 12th International Conference on Mobile and Ubiquitous Multimedia

Abstract

With more than 10,000 new videos posted online every day on social websites such as YouTube and Facebook, the internet is becoming an almost infinite source of information. One crucial challenge for the coming decade is to be able to harvest relevant information from this constant flow of multimodal data. This paper addresses the task of multimodal sentiment analysis, and conducts proof-of-concept experiments that demonstrate that a joint model that integrates visual, audio, and textual features can be effectively used to identify sentiment in Web videos. This paper makes three important contributions. First, it addresses for the first time the task of tri-modal sentiment analysis, and shows that it is a feasible task that can benefit from the joint exploitation of visual, audio and textual modalities. Second, it identifies a subset of audio-visual features relevant to sentiment analysis and presents guidelines on how to integrate these features. Finally, it introduces a new dataset consisting of real online data, which will be useful for future research in this area.