Markov blankets and meta-heuristics search: sentiment extraction from unstructured texts

Authors:
Edoardo Airoldi;Xue Bai;Rema Padman
Affiliations:
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA;School of Computer Science, Carnegie Mellon University, Pittsburgh, PA;The John Heinz III School of Public Policy and Management, Carnegie Mellon University, Pittsburgh, PA
Venue:
WebKDD'04 Proceedings of the 6th international conference on Knowledge Discovery on the Web: advances in Web Mining and Web Usage Analysis
Year:
2004

Abstract

Extracting sentiments from unstructured text has emerged as an important problem in many disciplines. An accurate method would make it possible, for example, to mine opinions posted online and learn customers’ preferences for economic or marketing research, or to gain a strategic advantage. In this paper, we propose a two-stage Bayesian algorithm that captures the dependencies among words and, at the same time, finds a vocabulary that is efficient for extracting sentiments. Experimental results on online movie reviews and online news show that the algorithm selects a parsimonious feature set, with substantially fewer predictor variables than the full data set, and predicts sentiment orientation better than several state-of-the-art machine learning methods. Our findings suggest that sentiments are captured by conditional dependence relations among words, rather than by keywords or high-frequency words.
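
To make the idea of searching for a small, efficient vocabulary concrete, the following is a minimal sketch of a tabu search over word subsets. It is not the authors' algorithm: the abstract does not specify the scoring function or the search moves, and the Markov blanket stage is omitted entirely. The Bernoulli naive Bayes scorer, the training-accuracy objective, the per-word `penalty`, the `tenure` parameter, and the toy documents are all assumptions made for illustration.

```python
from math import log

def fit(docs, labels, vocab):
    """Count word presence per class (stand-in Bernoulli naive Bayes)."""
    counts = {0: {w: 0 for w in vocab}, 1: {w: 0 for w in vocab}}
    n = {0: 0, 1: 0}
    for doc, y in zip(docs, labels):
        n[y] += 1
        for w in vocab:
            if w in doc:
                counts[y][w] += 1
    return counts, n

def predict(model, doc, vocab):
    """Return the class with the higher smoothed log-probability."""
    counts, n = model
    total = n[0] + n[1]
    best_y, best_lp = 0, None
    for y in (0, 1):
        lp = log((n[y] + 1) / (total + 2))
        for w in vocab:
            p = (counts[y][w] + 1) / (n[y] + 2)  # Laplace smoothing
            lp += log(p if w in doc else 1 - p)
        if best_lp is None or lp > best_lp:
            best_y, best_lp = y, lp
    return best_y

def accuracy(docs, labels, vocab):
    """Training accuracy; a real study would use held-out data."""
    model = fit(docs, labels, vocab)
    hits = sum(predict(model, d, vocab) == y for d, y in zip(docs, labels))
    return hits / len(docs)

def tabu_select(docs, labels, candidates, iters=10, tenure=2, penalty=0.01):
    """Hypothetical tabu search: flip one word in or out per iteration."""
    def score(vocab):
        # Reward accuracy, penalize vocabulary size (assumed objective).
        return accuracy(docs, labels, vocab) - penalty * len(vocab)

    current = frozenset()
    best, best_score = current, score(current)
    tabu = {}  # word -> last iteration at which flipping it is forbidden
    for it in range(iters):
        moves = [w for w in sorted(candidates) if tabu.get(w, -1) < it]
        if not moves:
            break
        # Take the best non-tabu move, even if it lowers the score.
        s, w = max(((score(current ^ {w}), w) for w in moves),
                   key=lambda t: t[0])
        current = current ^ {w}  # symmetric difference toggles membership
        tabu[w] = it + tenure
        if s > best_score:
            best, best_score = current, s
    return set(best), best_score

# Toy corpus (invented): each document is a set of words, label 1 = positive.
docs = [{"great", "film"}, {"great", "plot"},
        {"boring", "film"}, {"boring", "plot"}]
labels = [1, 1, 0, 0]
candidates = {"great", "boring", "film", "plot"}

selected, selected_score = tabu_select(docs, labels, candidates)
print(selected, selected_score)
```

On this toy corpus the size penalty pushes the search toward a vocabulary much smaller than the candidate set, which mirrors the parsimony goal stated in the abstract. The tabu list keeps the search from immediately undoing a move, so it can step through worse subsets without cycling.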