Batch-Mode Active Learning with Semi-supervised Cluster Tree for Text Classification
WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 01
The World Wide Web has grown rapidly in recent years, and with it the need for content-based webpage filtering. Most existing filtering systems, however, cannot easily satisfy the personalized filtering demands of different users at the same time. In this paper, we propose a customizable, instance-driven webpage filtering strategy. Our system produces a separate webpage filter for each user by mining the specific webpage classes that the user focuses on. A semi-supervised learning (SSL) approach is applied to obtain a precise description of the webpage class a user wants to filter, starting from the small set of example instances that the user provides. A feature selection step is then performed, and a Bayes classifier is built over the enlarged training set. Experimental results show that the proposed method is stable and performs well, and that it outperforms existing methods.
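The three stages named in the abstract (semi-supervised enlargement of a small instance set, feature selection, and a Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which SSL algorithm or selection criterion is used, so the sketch assumes plain self-training and a document-frequency spread score, and it assumes the user supplies seed pages for two classes, here called `block` and `allow`. All function names, parameters, and the toy pages are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_nb(docs, labels, vocab):
    """Multinomial naive Bayes with Laplace smoothing, restricted to vocab."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for doc, y in zip(docs, labels):
        counts[y].update(t for t in tokenize(doc) if t in vocab)
    loglik = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)
        loglik[c] = {t: math.log((counts[c][t] + 1) / total) for t in vocab}
    return prior, loglik

def predict_proba(model, doc):
    """Posterior over classes; tokens outside the vocabulary are ignored."""
    prior, loglik = model
    scores = {c: prior[c] + sum(loglik[c][t] for t in tokenize(doc) if t in loglik[c])
              for c in prior}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}

def self_train(seed_docs, seed_labels, unlabeled, threshold=0.75, rounds=5):
    """Assumed SSL step: repeatedly absorb confidently classified pages."""
    docs, labels, pool = list(seed_docs), list(seed_labels), list(unlabeled)
    for _ in range(rounds):
        vocab = {t for d in docs for t in tokenize(d)}
        model = train_nb(docs, labels, vocab)
        remaining, added = [], False
        for d in pool:
            p = predict_proba(model, d)
            best = max(p, key=p.get)
            if p[best] >= threshold:
                docs.append(d)
                labels.append(best)
                added = True
            else:
                remaining.append(d)
        pool = remaining
        if not added:
            break
    return docs, labels

def select_features(docs, labels, k):
    """Assumed criterion: keep the k terms whose per-class document
    frequency rates differ the most (ties broken alphabetically)."""
    df = {c: Counter() for c in set(labels)}
    n = Counter(labels)
    for doc, y in zip(docs, labels):
        df[y].update(set(tokenize(doc)))
    vocab = set().union(*df.values())

    def spread(t):
        rates = [df[c][t] / n[c] for c in df]
        return max(rates) - min(rates)

    return set(sorted(vocab, key=lambda t: (-spread(t), t))[:k])

# Toy data standing in for a user's seed instances and unlabeled pages.
seed_docs = ["casino poker betting jackpot", "python tutorial code examples"]
seed_labels = ["block", "allow"]
unlabeled = ["poker jackpot bonus casino", "code examples for python functions",
             "betting bonus odds", "functions tutorial"]

docs, labels = self_train(seed_docs, seed_labels, unlabeled)
vocab = select_features(docs, labels, k=10)
page_filter = train_nb(docs, labels, vocab)
```

Calling `predict_proba(page_filter, page_text)` then gives a per-class posterior for a new page. A per-user filter would be one such model trained from that user's own seed instances.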