Blog or block: Detecting blog bots through behavioral biometrics

  • Authors:
  • Zi Chu; Steven Gianvecchio; Aaron Koehl; Haining Wang; Sushil Jajodia

  • Affiliations:
  • Department of Computer Science, The College of William and Mary, Williamsburg, VA 23187, USA; Department of Computer Science, The College of William and Mary, Williamsburg, VA 23187, USA; Department of Computer Science, The College of William and Mary, Williamsburg, VA 23187, USA; Department of Computer Science, The College of William and Mary, Williamsburg, VA 23187, USA; Center for Secure Information Systems, George Mason University, Fairfax, VA 22030, USA

  • Venue:
  • Computer Networks: The International Journal of Computer and Telecommunications Networking
  • Year:
  • 2013

Abstract

Blog bots are automated scripts or programs that post comments to blog sites, often including spam or other malicious links. An effective defense against automated form filling and posting by blog bots is to detect and validate human presence. Conventional detection methods usually require direct participation by human users, such as recognizing a CAPTCHA image, which can be burdensome. In this paper, we present a new detection approach that uses behavioral biometrics, primarily mouse and keystroke dynamics, to distinguish humans from bots. Because it is based on passive monitoring, the proposed approach does not require any direct user participation. We collect real user input data from a very active online community and blog site, and use this data to characterize behavioral differences between humans and bots. The most useful features for classification provide the basis for a detection system consisting of two main components: a webpage-embedded logger and a server-side classifier. The webpage-embedded logger records mouse movement and keystroke data while a user is filling out a form and sends this data in batches to the server-side classifier, which labels the poster as human or bot. Our experimental results demonstrate an overall detection accuracy greater than 99%, with negligible overhead.
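
To make the logger-to-classifier pipeline concrete, the following is a minimal Python sketch of what the server side might do with one batch of logged events. It is an illustration under stated assumptions, not the authors' implementation: the event format, the feature names (`mean_speed`, `mean_dwell`), and the functions `extract_features` and `looks_like_bot` are all hypothetical, and a hand-written rule stands in for the trained classifier the abstract describes.

```python
import math
from statistics import mean


def extract_features(events):
    """Summarize one batch of logged input events into numeric features.

    Assumed (hypothetical) event format, one dict per event:
      {"type": "move", "t": <ms>, "x": <px>, "y": <px>}
      {"type": "keydown" or "keyup", "t": <ms>, "key": <str>}
    """
    # Mouse speed between consecutive movement samples.
    moves = [e for e in events if e["type"] == "move"]
    speeds = []
    for a, b in zip(moves, moves[1:]):
        dt = b["t"] - a["t"]
        if dt > 0:
            dist = math.hypot(b["x"] - a["x"], b["y"] - a["y"])
            speeds.append(dist / dt)  # pixels per millisecond

    # Keystroke dwell time: how long each key stays pressed.
    pressed = {}
    dwells = []
    for e in events:
        if e["type"] == "keydown":
            pressed[e["key"]] = e["t"]
        elif e["type"] == "keyup" and e["key"] in pressed:
            dwells.append(e["t"] - pressed.pop(e["key"]))

    return {
        "num_moves": len(moves),
        "num_keystrokes": len(dwells),
        "mean_speed": mean(speeds) if speeds else 0.0,
        "mean_dwell": mean(dwells) if dwells else 0.0,
    }


def looks_like_bot(features):
    # Placeholder rule, NOT a trained classifier: a form submitted with no
    # mouse movement and no completed keystrokes shows no human input at all.
    return features["num_moves"] == 0 and features["num_keystrokes"] == 0
```

In a real deployment the feature dictionary would be passed to a model trained on labeled human and bot sessions; the fixed rule here only marks where that model would sit in the pipeline.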