Fraud detection in reputation systems in e-markets using logistic regression and stepwise optimization

  • Authors:
  • Rafael Maranzato;Adriano Pereira;Marden Neubert;Alair Pereira do Lago

  • Affiliations:
  • Universo Online Inc., São Paulo, SP, Brazil;Federal Center for Technological Education of Minas Gerais (CEFET-MG), Belo Horizonte, MG, Brazil;Universo Online Inc., São Paulo, SP, Brazil;Univ. of São Paulo - USP, São Paulo, SP, Brazil

  • Venue:
  • ACM SIGAPP Applied Computing Review
  • Year:
  • 2010

Abstract

Reputation is the opinion of the public toward a person, a group of people, or an organization. Reputation systems are particularly important in e-markets, where they help buyers decide whether to purchase a product. Since a higher reputation means more profit, some users try to deceive such systems to increase their reputation. E-markets should protect their reputation systems from attacks in order to maintain a sound environment. This work addresses the task of finding attempts to deceive reputation systems in e-markets. Our goal is to generate a list of users (sellers) ranked by their probability of fraud. First, we describe characteristics of transactions that may indicate fraud, and we extend them to the sellers. We describe the results of a simple approach that ranks sellers by counting characteristics of fraud. We then incorporate characteristics that cannot be used by the counting approach, and we apply logistic regression to both the original and the improved sets of characteristics. We use real data from a large Brazilian e-market to train and evaluate our methods; the improved set with logistic regression performs better, especially when we apply stepwise optimization. We validate our results with fraud detection specialists from this marketplace. In the end, we increase the number of identified fraudsters against the reputation system by 112%. In terms of ranking, we reach an average precision of 93% after specialists' review of the list produced with logistic regression and stepwise optimization. We also detect 55% of fraudsters with a precision of 100%.
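
As an illustration only, the counting approach and the average-precision measure mentioned in the abstract might be sketched as follows. The seller identifiers, the binary characteristic flags, and the function names are hypothetical and are not taken from the paper; the abstract does not state how average precision was normalized, so the sketch assumes the common definition, averaging precision at each rank where a true fraudster appears.

```python
from typing import Dict, List, Set


def rank_by_count(flags: Dict[str, List[int]]) -> List[str]:
    """Rank sellers by how many binary fraud characteristics they exhibit."""
    # Sort by descending count; ties are broken by seller id for determinism.
    return sorted(flags, key=lambda seller: (-sum(flags[seller]), seller))


def average_precision(ranked: List[str], fraudsters: Set[str]) -> float:
    """Mean of precision@k over the ranks k at which a fraudster appears."""
    hits = 0
    precision_sum = 0.0
    for k, seller in enumerate(ranked, start=1):
        if seller in fraudsters:
            hits += 1
            precision_sum += hits / k
    return precision_sum / hits if hits else 0.0


# Hypothetical toy data: each seller has three binary fraud characteristics.
flags = {
    "s1": [1, 1, 0],
    "s2": [0, 0, 0],
    "s3": [1, 1, 1],
    "s4": [1, 0, 0],
}
# Hypothetical labels, standing in for the specialists' review.
confirmed_fraudsters = {"s3", "s4"}

ranking = rank_by_count(flags)
ap = average_precision(ranking, confirmed_fraudsters)
print(ranking, round(ap, 4))
```

In a logistic-regression variant, the count used as the sort key would be replaced by a fitted fraud probability per seller, while the evaluation function would stay the same.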