A method for managing access to web pages: filtering by statistical classification (FSC) applied to text

  • Authors:
  • Jonathan P. Caulkins;Wenxuan Ding;George Duncan;Ramayya Krishnan;Eric Nyberg

  • Affiliations:
  • The Heinz School, Carnegie Mellon University, Pittsburgh, PA;Department of Information and Decision Sciences, and of Computer Science, University of Illinois, Chicago, IL;The Heinz School, Carnegie Mellon University, Pittsburgh, PA;The Heinz School, Carnegie Mellon University, Pittsburgh, PA;The School of Computer Science, Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • Decision Support Systems
  • Year:
  • 2006

Abstract

Various entities (e.g., parents, employers) that provide users (e.g., children, employees) with access to web content wish to limit the content those users can reach. Available filtering methods are crude in that they too often block "acceptable" content while failing to block "unacceptable" content. This paper presents a general and flexible classification method based on statistical techniques applied to text material, which we call Filtering by Statistical Classification (FSC). From each entity's expressed opinions about which content in a training data set is or is not acceptable, FSC constructs a customized model that represents that entity's preferences. FSC then uses this customized model to examine new web content and to block unwanted content. The empirical results suggest that our method has greater predictive power than a variety of existing approaches.
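
The train-then-filter workflow described in the abstract can be illustrated with a minimal sketch. The abstract does not name the statistical technique that FSC uses, so the code below substitutes a multinomial naive Bayes classifier with add-one smoothing purely as a stand-in. The class name `PreferenceFilter`, the labels `"block"` and `"allow"`, and the tiny training set are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

LABELS = ("block", "allow")  # hypothetical label names, not from the paper


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class PreferenceFilter:
    """Per-entity text filter (assumed stand-in: multinomial naive Bayes)."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = {label: 0 for label in LABELS}

    def train(self, labeled_pages):
        """Learn from (page_text, label) pairs judged by one entity."""
        for text, label in labeled_pages:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))

    def log_score(self, text, label):
        """Log prior plus smoothed log likelihood of the text under a label.

        Assumes at least one training page exists for every label.
        """
        vocab = set().union(*self.word_counts.values())
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / total_docs)
        for token in tokenize(text):
            count = self.word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def should_block(self, text):
        """Block a page when the 'block' label scores higher than 'allow'."""
        return self.log_score(text, "block") > self.log_score(text, "allow")


# Hypothetical training set expressing one parent's preferences.
parent_labels = [
    ("online poker casino betting jackpot", "block"),
    ("casino slots betting odds bonus", "block"),
    ("homework help fractions math practice", "allow"),
    ("science project volcano school homework", "allow"),
]

parent_filter = PreferenceFilter()
parent_filter.train(parent_labels)

print(parent_filter.should_block("free casino betting bonus"))
print(parent_filter.should_block("math homework practice"))
```

Training a separate `PreferenceFilter` on a different entity's labels would yield a different model, which is the customization the abstract describes; a realistic system would need far more training data and a properly evaluated classifier.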