A domain independent framework to extract and aggregate analogous features in online reviews

Authors:
Archana Bhattarai;Nobal Niraula;Vasile Rus;King-Ip Lin
Affiliations:
Department of Computer Science, The University of Memphis, Memphis, TN;Department of Computer Science, The University of Memphis, Memphis, TN;Department of Computer Science, The University of Memphis, Memphis, TN;Department of Computer Science, The University of Memphis, Memphis, TN
Venue:
CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Year:
2012

Abstract

Detecting and extracting features from online reviews is both important and challenging, especially when domain knowledge is not explicitly available. Moreover, opinions about the same feature of a product or service are frequently expressed in different lexical forms. In this paper, we present a novel framework that automatically detects, extracts, and aggregates semantically related features of reviewed products and services. Our model uses sentence-level syntactic and lexical information to detect candidate feature words, and corpus-level co-occurrence statistics to group semantically similar features, yielding high-precision feature detection. This high-precision feature aggregation gives our model a distinct advantage over state-of-the-art approaches such as double propagation: it produces short, succinct sets of features, whereas existing approaches can generate thousands. We evaluate our model on online reviews from two unrelated domains, restaurants and cameras, to verify its domain independence. In this evaluation, our model outperformed existing state-of-the-art probabilistic models.
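
The two stages named in the abstract (candidate detection from sentence-level cues, then grouping by corpus-level co-occurrence) can be illustrated with a minimal sketch. This is not the authors' model. It assumes sentences are already tokenized and tagged with Penn Treebank-style part-of-speech tags, treats frequent nouns as candidate features, and merges candidates whose sentence-level co-occurrence vectors have high cosine similarity. The function names, the `min_count` cutoff, and the `threshold` value are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def candidate_features(tagged_sentences, min_count=2):
    """Treat nouns (tags starting with 'NN') seen at least min_count times as candidates."""
    counts = Counter(
        word.lower()
        for sentence in tagged_sentences
        for word, tag in sentence
        if tag.startswith("NN")
    )
    return {word for word, n in counts.items() if n >= min_count}


def context_vectors(tagged_sentences, candidates):
    """For each candidate, count the other words that appear in the same sentence."""
    vectors = {c: Counter() for c in candidates}
    for sentence in tagged_sentences:
        present = {word.lower() for word, _ in sentence}
        for c in candidates & present:
            for other in present - {c}:
                vectors[c][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_features(tagged_sentences, min_count=2, threshold=0.5):
    """Merge candidates whose context vectors are similar (single-link, via union-find)."""
    candidates = candidate_features(tagged_sentences, min_count)
    vectors = context_vectors(tagged_sentences, candidates)
    parent = {c: c for c in candidates}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(candidates), 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for c in candidates:
        groups.setdefault(find(c), set()).add(c)
    return sorted(sorted(group) for group in groups.values())
```

Calling `group_features` on a list of tagged review sentences returns a sorted list of feature groups, so words such as "waiter" and "server" fall into one group when they tend to appear alongside the same opinion words. A real system would need a part-of-speech tagger and a more careful similarity measure than this sketch provides.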