This paper proposes an efficient framework for topic-specific Web text filtering, aimed at blocking Web text content on selected topics. The framework combines a hybrid feature selection method with a highly efficient filtering engine. In training, features are selected using the CHI statistic and rough set theory, and the filter is then constructed with the Vector Space Model. We train the framework on large datasets, and the results suggest that it is more effective for topic-specific text filtering. The framework runs on a server such as a gateway, and it is more efficient than a client-based system.
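As an illustration of the training pipeline described above, the following is a minimal sketch, not the authors' implementation. It scores terms with the standard CHI statistic over a 2x2 contingency table, keeps the top-scoring terms, and builds a simple Vector Space Model filter that compares a page against the centroid of on-topic training documents by cosine similarity. The rough set reduction step is omitted because the abstract does not specify it. All names (`chi_square`, `select_features`, `train_filter`, `should_block`), the toy data, and the threshold of 0.5 are assumptions made for this example.

```python
import math
from collections import Counter


def chi_square(docs, labels, term):
    """CHI statistic of `term` with respect to the on-topic class."""
    a = b = c = d = 0  # 2x2 contingency table counts
    for tokens, on_topic in zip(docs, labels):
        present = term in tokens
        if present and on_topic:
            a += 1  # term present, document on-topic
        elif present:
            b += 1  # term present, document off-topic
        elif on_topic:
            c += 1  # term absent, document on-topic
        else:
            d += 1  # term absent, document off-topic
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def select_features(docs, labels, k):
    """Return the k terms with the highest CHI score (ties broken alphabetically)."""
    vocab = {t for tokens in docs for t in tokens}
    ranked = sorted(vocab, key=lambda t: (-chi_square(docs, labels, t), t))
    return ranked[:k]


def vectorize(tokens, features):
    """Term-frequency vector restricted to the selected features."""
    counts = Counter(tokens)
    return [float(counts[f]) for f in features]


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def train_filter(docs, labels, k):
    """Select features, then average the on-topic vectors into a centroid."""
    features = select_features(docs, labels, k)
    on_topic = [vectorize(t, features) for t, y in zip(docs, labels) if y]
    centroid = [sum(col) / len(on_topic) for col in zip(*on_topic)]
    return features, centroid


def should_block(tokens, features, centroid, threshold=0.5):
    """Block a page whose similarity to the topic centroid reaches the threshold."""
    return cosine(vectorize(tokens, features), centroid) >= threshold


# Hypothetical toy corpus: True marks documents on the topic to be blocked.
DOCS = [
    ["casino", "bet", "win", "money"],
    ["poker", "bet", "casino", "bonus"],
    ["weather", "rain", "sunny", "today"],
    ["football", "match", "win", "score"],
]
LABELS = [True, True, False, False]

features, centroid = train_filter(DOCS, LABELS, k=2)
print(features)
print(should_block(["casino", "bet", "jackpot"], features, centroid))
print(should_block(["rain", "today"], features, centroid))
```

In a gateway deployment such as the one the abstract describes, the feature list and centroid would be computed offline, so that only the vectorize-and-compare step runs per page. The centroid classifier is one simple choice among several possible VSM filters.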