An efficient topic-specific web text filtering framework

  • Authors:
  • Qiang Li;Jianhua Li

  • Affiliations:
  • Modern Communication Institute, Shanghai Jiaotong Univ., Shanghai, P.R. China;Modern Communication Institute, Shanghai Jiaotong Univ., Shanghai, P.R. China

  • Venue:
  • APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development
  • Year:
  • 2005

Abstract

In this paper, an efficient topic-specific Web text filtering framework is proposed. The framework focuses on blocking Web text content on specific topics. It combines a hybrid feature selection method with a highly efficient filtering engine. In training, we select features based on the CHI statistic and rough set theory, and then construct the filter using the Vector Space Model. We train the framework on large datasets, and the results suggest that it is more effective for topic-specific text filtering. The framework runs on a server such as a gateway, and it is more efficient than a client-based system.
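
The abstract names the CHI statistic and the Vector Space Model but gives no implementation details, so the following is only a minimal sketch of how such a pipeline might look, not the authors' method. It scores terms with the standard chi-square formula for a 2x2 term/class table, keeps the top-scoring terms that are positively associated with the topic, and compares a page's term-count vector to the centroid of the on-topic training vectors by cosine similarity. The rough-set reduction step is omitted. All function names, the restriction to positively associated terms, the centroid representation, and the 0.5 threshold are assumptions made for illustration.

```python
import math
from collections import Counter


def chi_square(a, b, c, d):
    """CHI statistic for a 2x2 term/class table.

    a: on-topic docs containing the term, b: off-topic docs containing it,
    c: on-topic docs lacking it,          d: off-topic docs lacking it.
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, k):
    """Return up to k terms with the highest CHI score for the topic class."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for tokens, is_topic in zip(docs, labels):
        # Document frequency: count each term once per document.
        (pos_df if is_topic else neg_df).update(set(tokens))
    scores = {}
    for term in set(pos_df) | set(neg_df):
        a, b = pos_df[term], neg_df[term]
        c, d = n_pos - a, n_neg - b
        # Assumption: keep only terms positively associated with the topic,
        # since the CHI statistic itself is symmetric in the two classes.
        if a * d > b * c:
            scores[term] = chi_square(a, b, c, d)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def vectorize(tokens, features):
    """Map a token list to a term-count vector over the selected features."""
    counts = Counter(tokens)
    return [counts[f] for f in features]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 0.0 if norm == 0 else dot / norm


def train_filter(docs, labels, k):
    """Select features and build the centroid of the on-topic vectors.

    Assumes at least one training document is labeled on-topic.
    """
    features = select_features(docs, labels, k)
    topic_vecs = [vectorize(t, features) for t, y in zip(docs, labels) if y]
    centroid = [sum(col) / len(topic_vecs) for col in zip(*topic_vecs)]
    return features, centroid


def should_block(tokens, features, centroid, threshold=0.5):
    """Block a page whose vector is close enough to the topic centroid."""
    return cosine(vectorize(tokens, features), centroid) >= threshold
```

In this sketch, a caller would tokenize the training pages, pass them with 0/1 topic labels to train_filter, and then call should_block on each incoming page. A gateway deployment like the one the abstract describes would run that check once per response on the server, rather than on every client.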