Efficient confident search in large review corpora

Authors:
Theodoros Lappas; Dimitrios Gunopulos
Affiliations:
UC Riverside; University of Athens
Venue:
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
Year:
2010

Abstract

Given an extensive corpus of reviews on an item, a potential customer goes through the expressed opinions and collects information in order to form an educated opinion and, ultimately, make a purchase decision. This task is often hindered by false reviews that fail to capture the true quality of the item's attributes. These reviews may be based on insufficient information or may even be fraudulent, submitted to manipulate the item's reputation. In this paper, we formalize the Confident Search paradigm for review corpora. We then present a complete search framework which, given a set of item attributes, efficiently searches a large corpus and selects a compact set of high-quality reviews that accurately captures the overall consensus of the reviewers on the specified attributes. We also introduce CREST (Confident REview Search Tool), a user-friendly implementation of our framework and a valuable tool for anyone dealing with large review corpora. The efficacy of our framework is demonstrated through a rigorous experimental evaluation.
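The abstract states the search problem without giving the algorithm, so the following is only a toy illustration of the problem statement, not the method of the paper or of CREST. It assumes each review has already been reduced to a map from attribute to polarity (+1 or -1), takes the majority polarity as the "consensus", discards reviews that contradict it, and then picks a small covering set with a standard greedy set-cover heuristic. The names `consensus`, `select_reviews`, and the review layout are hypothetical.

```python
from collections import Counter


def consensus(reviews, attributes):
    """Majority polarity (+1 or -1) per queried attribute.

    Assumption: a simple majority vote stands in for "overall consensus".
    Attributes with a tie or with no mentions are left out of the result.
    """
    result = {}
    for attr in attributes:
        votes = Counter(r["opinions"][attr] for r in reviews if attr in r["opinions"])
        if votes[1] != votes[-1]:
            result[attr] = 1 if votes[1] > votes[-1] else -1
    return result


def select_reviews(reviews, attributes):
    """Return (chosen review ids, attributes left uncovered)."""
    agreed = consensus(reviews, attributes)

    # Keep only reviews that mention at least one queried attribute and
    # never contradict the majority polarity on the ones they mention.
    candidates = {}
    for r in reviews:
        mentioned = {a for a in agreed if a in r["opinions"]}
        if mentioned and all(r["opinions"][a] == agreed[a] for a in mentioned):
            candidates[r["id"]] = mentioned

    # Greedy set cover: repeatedly take the review covering the most
    # still-uncovered attributes (ties broken by smallest id).
    uncovered = set(agreed)
    chosen = []
    while uncovered:
        best = max(sorted(candidates),
                   key=lambda rid: len(candidates[rid] & uncovered),
                   default=None)
        if best is None or not candidates[best] & uncovered:
            break  # nothing left that covers a remaining attribute
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen, uncovered
```

Calling `select_reviews(reviews, ["battery", "screen", "price"])` on a list of such review dictionaries returns the ids of the selected reviews together with any queried attributes that no consensus-agreeing review covers. A real system would also need opinion extraction and a notion of review quality, both of which this sketch leaves out.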