An efficient approach for mining web content sensitivity

  • Authors:
  • Cheng Wang;Ying Liu;Liheng Jian;Peng Zhang

  • Affiliations:
  • Agilent Technologies Co. Ltd., Beijing 100102, China.;Graduate University of Chinese Academy of Sciences/ Fictitious Economy and Data Science Research Center, Beijing 100080, China.;Graduate University of Chinese Academy of Sciences.;Graduate University of Chinese Academy of Sciences

  • Venue:
  • International Journal of Knowledge and Web Intelligence
  • Year:
  • 2009

Abstract

Abnormal remarks on the web, such as those involving violence, threats, or superstition, may disturb social order and public morality; such remarks are referred to as sensitive content. To quantify how sensitive a webpage is, we propose the concept of web content sensitivity, together with an approach for mining it. In our experiment, the approach identified a number of sensitive webpages that traditional frequency-based methods failed to find. Varying the sensitivity values assigned to the keywords yielded different sets of high-sensitivity keywords, along with their corresponding webpages.