Synthetic review spamming and defense

  • Authors:
  • Alex Morales, Huan Sun, Xifeng Yan

  • Affiliations:
  • University of California, Santa Barbara, Santa Barbara, USA (all authors)

  • Venue:
  • Proceedings of the 22nd international conference on World Wide Web companion
  • Year:
  • 2013

Abstract

Online reviews are widely adopted on many websites such as Amazon, Yelp, and TripAdvisor. Positive reviews can bring significant financial gains, while negative ones often cause sales losses. This fact, unfortunately, creates strong incentives for opinion spam that misleads readers. Instead of hiring humans to write deceptive reviews, in this work we bring to attention an automated, low-cost process for generating fake reviews, variations of which could easily be employed by malicious attackers in practice. To the best of our knowledge, we are the first to expose the potential risk of machine-generated deceptive reviews. Our simple review synthesis model uses one truthful review as a template and replaces its sentences with those from other reviews in a repository. The fake reviews generated by this mechanism are extremely hard to detect: both state-of-the-art machine detectors and human readers have an error rate of 35%-48%. A novel defense method that leverages the difference in semantic flow between fake and truthful reviews is developed, reducing the detection error rate to approximately 22%. Nevertheless, further decreasing the error rate remains a challenging research task.
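The template-based synthesis described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify how a replacement sentence is chosen, so the word-overlap similarity used here is an assumption, as are the naive sentence splitting and all function names.

```python
import random
import re

def split_sentences(text):
    """Naively split a review into sentences on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def synthesize_review(template, repository, seed=0):
    """Sketch of the synthesis model: take one truthful review as a
    template and replace each of its sentences with a sentence drawn
    from other reviews in a repository. The similarity measure
    (shared lowercase words, ties broken randomly) is an assumption."""
    rng = random.Random(seed)
    pool = [s for review in repository for s in split_sentences(review)]
    fake = []
    for sentence in split_sentences(template):
        words = set(sentence.lower().split())
        # Pick the pool sentence sharing the most words with this one.
        best = max(pool,
                   key=lambda s: (len(words & set(s.lower().split())),
                                  rng.random()))
        fake.append(best)
    return " ".join(fake)
```

Because every output sentence is a real human-written sentence, such a review is locally fluent, which is consistent with the high human and machine error rates the paper reports; the defense the paper proposes targets the review's global semantic flow instead.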