Comparison of feature-level learning methods for mining online consumer reviews

Authors:
Li Chen; Luole Qi; Feng Wang
Affiliations:
Department of Computer Science, Hong Kong Baptist University, Hong Kong, China (all authors)
Venue:
Expert Systems with Applications: An International Journal
Year:
2012

Abstract

Feature-level opinion mining usually involves three tasks: extracting product entities from consumer reviews, identifying the opinion words associated with those entities, and determining the polarity of each opinion (e.g., positive, negative, or neutral). In recent years, two major approaches have been proposed to determine opinions at the feature level: model-based methods, such as the one based on lexicalized Hidden Markov Models (L-HMMs), and statistical methods, such as the technique based on association rule mining. However, little work has compared these algorithms with respect to their practical ability to identify various types of review elements, such as features, opinions, intensifiers, entity phrases, and infrequent entities. Moreover, little attention has been paid to applying more discriminative learning models to these opinion mining tasks. In this paper, we not only experimentally compare these methods on a real-world review dataset, but also adopt the Conditional Random Fields (CRFs) model and evaluate its performance against the related algorithms. For the CRFs-based mining algorithm, we further test the role of a self-tagging process under two automatic training conditions and identify the combination of learning functions that optimizes its learning performance. The comparative experiment shows that the CRFs-based method achieves higher accuracy than the other methods in mining multiple review elements.