Investigating Learning Approaches for Blog Post Opinion Retrieval

Authors:
Shima Gerani; Mark J. Carman; Fabio Crestani
Affiliations:
Faculty of Informatics, University of Lugano, Lugano, Switzerland; Faculty of Informatics, University of Lugano, Lugano, Switzerland; Faculty of Informatics, University of Lugano, Lugano, Switzerland
Venue:
ECIR '09 Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval
Year:
2009

Abstract

Blog post opinion retrieval is the problem of identifying posts that express an opinion about a particular topic. The problem is usually solved with a three-step process in which relevant posts are first retrieved, opinion scores are then generated for each document, and finally the opinion and relevance scores are combined to produce a single ranking. In this paper, we study the effectiveness of classification and rank learning techniques for solving the blog post opinion retrieval problem. We have chosen not to rely on external lexicons of opinionated terms, but instead investigate to what extent the list of opinionated terms can be mined from the same corpus of relevance/opinion assessments that is used to train the retrieval system. We compare popular feature selection methods, such as the weighted log-likelihood ratio and mutual information, for use both in selecting terms for training an opinionated document classifier and as term weights for generating simpler (non-learning-based) aggregate opinion scores for documents. We thereby analyze what performance gains result from learning in the opinion detection phase. Furthermore, we compare different learning-based and non-learning-based methods for combining relevance and opinion information in order to generate a ranked list of opinionated posts, thereby investigating the effect of learning on the ranking phase.
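
As an illustration only, the non-learning-based variant of the pipeline described in the abstract might be sketched as follows: term weights are mined from assessed documents with a weighted log-likelihood ratio, the top-weighted terms yield an aggregate opinion score, and that score is interpolated with a relevance score. The abstract gives no formulas, so everything concrete here is an assumption: the common WLLR form P(t|O) * log(P(t|O) / P(t|not O)), the add-one smoothing, the length normalization, the linear interpolation, and all function and parameter names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def wllr_weights(opinionated_docs, other_docs, smoothing=1.0):
    """Weighted log-likelihood ratio per term (assumed form):
    P(t|O) * log(P(t|O) / P(t|not O)), with additive smoothing."""
    pos = Counter(t for doc in opinionated_docs for t in doc)
    neg = Counter(t for doc in other_docs for t in doc)
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + smoothing * len(vocab)
    neg_total = sum(neg.values()) + smoothing * len(vocab)
    weights = {}
    for term in vocab:
        p_pos = (pos[term] + smoothing) / pos_total
        p_neg = (neg[term] + smoothing) / neg_total
        weights[term] = p_pos * math.log(p_pos / p_neg)
    return weights


def select_top_terms(weights, k):
    """Keep the k highest-weighted terms as the mined opinion vocabulary."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return set(ranked[:k])


def opinion_score(doc, weights, top_terms):
    """Aggregate (non-learning) opinion score: summed weights of the
    selected terms in the document, normalized by document length."""
    if not doc:
        return 0.0
    return sum(weights[t] for t in doc if t in top_terms) / len(doc)


def combine(relevance, opinion, lam=0.5):
    """Linear interpolation of relevance and opinion scores (assumed)."""
    return (1.0 - lam) * relevance + lam * opinion


# Toy, made-up training data: tokenized posts judged opinionated or not.
opinionated = [["i", "love", "this", "phone"], ["i", "hate", "this", "phone"]]
other = [["the", "phone", "has", "a", "camera"], ["this", "phone", "ships", "today"]]

weights = wllr_weights(opinionated, other)
top_terms = select_top_terms(weights, k=3)
score = combine(relevance=0.8, opinion=opinion_score(["i", "love", "it"], weights, top_terms))
```

A learning-based alternative, as studied in the paper, would replace `opinion_score` with a trained classifier and `combine` with a learned ranking function; this sketch covers only the simpler baseline shape.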