Most existing categorization algorithms deal with homogeneous Web data objects and, when they take interrelationships with other types of objects into account, treat the interrelated objects merely as additional features. However, focusing on any single aspect of these interrelationships and objects will not fully reveal their true categories. In this paper, we propose a novel categorization algorithm, the Iterative Reinforcement Categorization algorithm (IRC), to exploit the full interrelationships between the heterogeneous objects on the Web. IRC attempts to classify the interrelated Web objects by iterative reinforcement between the individual classification results of different object types via their interrelationships. Experiments on a clickthrough log dataset from the MSN search engine show that, measured by F1, IRC achieves a 26.4% improvement over a pure content-based classification method, a 21% improvement over a query metadata-based method, and a 16.4% improvement over a virtual document-based method. Furthermore, our experiments show that IRC converges rapidly.
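The abstract does not give IRC's update rule, so the following is only a minimal sketch of the general idea of iterative reinforcement between two object types (queries and pages) linked by clickthrough data. Everything specific here is an assumption made for illustration: the function names, the mixing weight `alpha`, the click-weighted neighbor average, and the stopping tolerance are hypothetical and are not taken from the paper.

```python
def normalize(dist):
    """Scale a {class: score} dict so its values sum to 1 (if possible)."""
    total = sum(dist.values())
    return {c: v / total for c, v in dist.items()} if total else dict(dist)


def neighbor_average(neighbors, scores, classes):
    """Click-weighted average of the neighbors' class distributions."""
    total = sum(neighbors.values())
    return {
        c: sum(w * scores[n][c] for n, w in neighbors.items()) / total
        for c in classes
    }


def iterative_reinforcement(page_prior, query_prior, clicks,
                            alpha=0.5, max_iter=50, tol=1e-6):
    """Hypothetical reinforcement loop (not the paper's exact method).

    page_prior / query_prior: {object_id: {class: score}} from any
        per-type classifier (assumed to be given).
    clicks: {(query_id, page_id): click_count}.
    alpha: assumed weight kept on an object's own prior at each step.
    """
    classes = sorted({c for d in list(page_prior.values())
                      + list(query_prior.values()) for c in d})

    def prepare(prior):
        return {o: normalize({c: d.get(c, 0.0) for c in classes})
                for o, d in prior.items()}

    p_prior, q_prior = prepare(page_prior), prepare(query_prior)

    # Build the bipartite adjacency in both directions.
    page_nbrs = {p: {} for p in p_prior}
    query_nbrs = {q: {} for q in q_prior}
    for (q, p), w in clicks.items():
        page_nbrs[p][q] = w
        query_nbrs[q][p] = w

    def update(prior, nbrs, other):
        new = {}
        for o, own in prior.items():
            if nbrs[o]:
                avg = neighbor_average(nbrs[o], other, classes)
                new[o] = {c: alpha * own[c] + (1 - alpha) * avg[c]
                          for c in classes}
            else:
                new[o] = dict(own)  # isolated object keeps its prior
        return new

    pages, queries = dict(p_prior), dict(q_prior)
    iterations = 0
    for iterations in range(1, max_iter + 1):
        new_pages = update(p_prior, page_nbrs, queries)
        new_queries = update(q_prior, query_nbrs, pages)
        # Largest change in any score since the previous round.
        delta = max(
            [abs(new_pages[p][c] - pages[p][c]) for p in pages for c in classes]
            + [abs(new_queries[q][c] - queries[q][c])
               for q in queries for c in classes]
        )
        pages, queries = new_pages, new_queries
        if delta < tol:
            break
    return pages, queries, iterations


# Toy example: p2 is ambiguous by content alone, but shares a query with p1.
pages, queries, n_iter = iterative_reinforcement(
    page_prior={"p1": {"sports": 0.9, "news": 0.1},
                "p2": {"sports": 0.5, "news": 0.5}},
    query_prior={"q1": {"sports": 0.8, "news": 0.2}},
    clicks={("q1", "p1"): 3, ("q1", "p2"): 2},
)
```

In this toy setup the ambiguous page `p2` drifts toward the class favored by the query it shares with `p1`, which is the kind of cross-type reinforcement the abstract describes. Because each round keeps a fixed share `alpha` of the prior, the updates in this sketch shrink geometrically; that is a property of this simplified rule and says nothing about the convergence behavior reported for IRC itself.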