A Utility-Based Web Content Sensitivity Mining Approach

  • Authors:
  • Cheng Wang; Ying Liu; Liheng Jian; Peng Zhang

  • Venue:
  • WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 03
  • Year:
  • 2008

Abstract

Abnormal remarks on the World Wide Web, such as those involving violence, threats, or superstition, may disturb social order and public morality. Most traditional methods filter a page whenever it contains a keyword from a predefined blacklist. Such methods cannot provide a quantitative measure of how sensitive the content is. In this paper, we propose a utility-based Web content sensitivity mining approach. Utility is treated as the measure of how sensitive a page is, which allows Internet regulators to take different actions according to different sensitivity values. We apply our approach to a real-world Web dataset, where it identified a number of sensitive Web pages that traditional frequency-based methods failed to find. By varying the sensitivity values of the keywords, different sets of high-sensitivity keywords were discovered.
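
The contrast between blacklist filtering and a utility score can be illustrated with a minimal sketch. The abstract does not give the paper's actual utility function, so this sketch assumes the common utility-mining form: a page's utility is the sum, over keywords, of the keyword's occurrence count multiplied by its sensitivity value. The keyword weights, the thresholds, and the names `SENSITIVITY`, `page_utility`, `blacklist_hit`, and `action_for` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword sensitivity values; these are NOT taken from the paper.
SENSITIVITY = {"attack": 5.0, "threat": 3.0, "curse": 1.0}


def _word_counts(text):
    """Count lowercase alphabetic tokens in the page text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def page_utility(text, sensitivity=SENSITIVITY):
    """Assumed utility: sum of (keyword count) * (keyword sensitivity value)."""
    counts = _word_counts(text)
    return sum(counts[word] * weight for word, weight in sensitivity.items())


def blacklist_hit(text, sensitivity=SENSITIVITY):
    """Traditional filter: True as soon as any listed keyword appears."""
    counts = _word_counts(text)
    return any(counts[word] > 0 for word in sensitivity)


def action_for(utility, review_at=3.0, block_at=8.0):
    """Map a utility score to an action; the thresholds are illustrative."""
    if utility >= block_at:
        return "block"
    if utility >= review_at:
        return "review"
    return "allow"


if __name__ == "__main__":
    pages = [
        "A threat, then an attack and another attack.",
        "He muttered a curse.",
    ]
    for page in pages:
        score = page_utility(page)
        print(blacklist_hit(page), score, action_for(score))
```

Under these assumed weights, both sample pages trigger the blacklist, but only the first reaches a utility high enough to be blocked. Changing the values in `SENSITIVITY` changes which keywords dominate the score, which mirrors the abstract's remark about varying the keywords' sensitivity values.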