Topical query classification, as one step toward understanding users' search intent, is gaining increasing attention in information retrieval. Previous work on this subject has focused primarily on enriching query features, for example by augmenting queries with search engine results. In this work, we investigate a completely orthogonal approach: instead of improving the feature representation, we aim to drastically increase the amount of training data. To this end, we propose two semi-supervised learning methods that exploit user click-through data. In the first approach, we infer the class memberships of unlabeled queries from those of labeled ones according to their proximity in a click graph, and then use these automatically labeled queries to train classifiers with query terms as features. In the second approach, click graph learning and query classifier training are conducted jointly under an integrated objective. We evaluate our methods in two applications, product intent and job intent classification. In both cases, we expand the training data by over two orders of magnitude, leading to significant improvements in classification performance. An additional finding is that, with a large amount of training data obtained in this fashion, a classifier based on simple query term features can outperform those using state-of-the-art augmented features.
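To make the graph-based labeling step more concrete, here is a minimal sketch of how labels might spread from a few labeled queries to unlabeled ones through shared clicked URLs. It is not the paper's actual formulation: the abstract gives no objective or update rule, so this uses a plain clamped label propagation over a bipartite query-URL graph as a stand-in. The function name `propagate_labels`, the example queries and URLs, the neutral starting score of 0.5, and the 0.5 decision threshold are all illustrative assumptions.

```python
from collections import defaultdict


def propagate_labels(clicks, seeds, iterations=20):
    """Spread seed labels over a bipartite click graph (illustrative only).

    clicks: dict mapping (query, url) -> positive click count.
    seeds:  dict mapping a labeled query -> 1.0 (positive) or 0.0 (negative).
    Returns a dict mapping every query -> score in [0, 1].
    """
    by_query = defaultdict(dict)
    by_url = defaultdict(dict)
    for (query, url), count in clicks.items():
        by_query[query][url] = count
        by_url[url][query] = count

    # Assumption: unlabeled queries start at the neutral value 0.5.
    q_score = {q: seeds.get(q, 0.5) for q in by_query}
    for _ in range(iterations):
        # URL score: click-weighted average of the scores of its queries.
        u_score = {
            u: sum(c * q_score[q] for q, c in qs.items()) / sum(qs.values())
            for u, qs in by_url.items()
        }
        # Query score: click-weighted average of the scores of its URLs.
        # Labeled seed queries are clamped to their known labels.
        q_score = {
            q: seeds[q] if q in seeds
            else sum(c * u_score[u] for u, c in us.items()) / sum(us.values())
            for q, us in by_query.items()
        }
    return q_score


# Hypothetical click log: two product-intent and two non-product queries.
clicks = {
    ("buy laptop", "shop.example/laptops"): 10,
    ("cheap notebook pc", "shop.example/laptops"): 4,
    ("python tutorial", "docs.example/python"): 8,
    ("learn python", "docs.example/python"): 5,
}
seeds = {"buy laptop": 1.0, "python tutorial": 0.0}

scores = propagate_labels(clicks, seeds)
# Threshold the propagated scores to get automatic labels for unlabeled queries.
auto_labeled = {q: s >= 0.5 for q, s in scores.items() if q not in seeds}
```

In the first approach, queries labeled this way would then serve as extra training examples for a classifier over query terms. The second approach would instead optimize the graph labeling and the classifier together, which this sketch does not attempt.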