Existing categorization algorithms deal with homogeneous Web objects; when they take interrelationships with other types of objects into account, they treat the interrelated objects merely as additional features. However, focusing on any single aspect of the inter-object relationship is not sufficient to fully reveal the true categories of Web objects. In this paper, we propose a novel categorization algorithm, the Iterative Reinforcement Categorization algorithm (IRC), which exploits the full interrelationship between different types of Web objects, including Web pages and queries. IRC classifies interrelated Web objects by iteratively reinforcing the individual classification results of each object type through their interrelationships. Experiments on a clickthrough-log dataset from the MSN search engine show that, in terms of the F1 measure, IRC achieves a 26.4% improvement over a pure content-based classification method, a 21% improvement over a query-metadata-based method, and a 16.4% improvement over the well-known virtual-document-based method. Our experiments also show that IRC converges fast enough to be applicable to real-world applications.
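The abstract does not spell out IRC's update rule, so the following is only a minimal sketch of the general idea of iterative reinforcement between two object types. It assumes each page and query starts with class scores from some independent classifier, and that clickthrough counts link queries to pages. The blending weight `alpha`, the stopping tolerance `tol`, the click-weighted averaging, and all function names are illustrative assumptions, not the paper's formulation.

```python
def normalize(vec):
    """Scale a score vector so it sums to 1 (uniform if it is all zeros)."""
    total = sum(vec)
    if total == 0:
        return [1.0 / len(vec)] * len(vec)
    return [v / total for v in vec]


def neighbor_average(links, scores, n_classes):
    """Click-weighted average of the class scores of linked objects."""
    acc = [0.0] * n_classes
    for neighbor, weight in links:
        for c in range(n_classes):
            acc[c] += weight * scores[neighbor][c]
    return normalize(acc)


def reinforce(base, links, other_scores, alpha, n_classes):
    """Blend each object's initial scores with its neighbours' current scores."""
    updated = []
    for i, own in enumerate(base):
        if not links[i]:
            updated.append(list(own))  # no clicks: keep the initial scores
            continue
        avg = neighbor_average(links[i], other_scores, n_classes)
        updated.append([alpha * o + (1 - alpha) * a for o, a in zip(own, avg)])
    return updated


def max_change(old, new):
    """Largest absolute change in any single class score."""
    return max(abs(a - b) for ro, rn in zip(old, new) for a, b in zip(ro, rn))


def iterative_reinforcement(page_init, query_init, clicks,
                            alpha=0.5, max_iter=50, tol=1e-6):
    """clicks is a list of (query_index, page_index, click_count) triples."""
    n_classes = len(page_init[0])
    page_links = [[] for _ in page_init]
    query_links = [[] for _ in query_init]
    for q, p, count in clicks:
        page_links[p].append((q, count))
        query_links[q].append((p, count))

    page_base = [normalize(v) for v in page_init]
    query_base = [normalize(v) for v in query_init]
    pages, queries = page_base, query_base

    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Pages are updated from queries, then queries from the updated pages.
        new_pages = reinforce(page_base, page_links, queries, alpha, n_classes)
        new_queries = reinforce(query_base, query_links, new_pages, alpha, n_classes)
        delta = max(max_change(pages, new_pages), max_change(queries, new_queries))
        pages, queries = new_pages, new_queries
        if delta < tol:
            break
    return pages, queries, iterations


def labels(scores):
    """Index of the highest-scoring class for each object."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Toy data (made up): two classes, two pages, two queries.
page_init = [[0.9, 0.1], [0.5, 0.5]]   # page 1 is ambiguous on content alone
query_init = [[0.8, 0.2], [0.1, 0.9]]
clicks = [(0, 0, 5), (0, 1, 3), (1, 1, 1)]

pages, queries, iterations = iterative_reinforcement(page_init, query_init, clicks)
page_labels = labels(pages)
query_labels = labels(queries)
```

In this toy run, the ambiguous page 1 is clicked mostly from query 0, so the reinforcement pulls it toward class 0, while query 1 keeps its own class. Because each update is a fixed blend of the initial scores and an average of neighbour scores, the changes shrink geometrically under these assumptions; that is a property of this sketch and says nothing about the convergence behaviour reported for IRC itself.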