Varying approaches to topical web query classification

Authors:
Steven M. Beitzel;Eric C. Jensen;Abdur Chowdhury;Ophir Frieder
Affiliations:
Telcordia Technologies, Piscataway, NJ;Illinois Institute of Technology, Chicago, IL;Illinois Institute of Technology, Chicago, IL;Illinois Institute of Technology, Chicago, IL
Venue:
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2007

Abstract

Topical classification of web queries has drawn recent interest because of the promise it offers in improving retrieval effectiveness and efficiency. However, much of this promise depends on whether classification is performed before or after the query is used to retrieve documents. We examine two previously unaddressed issues in query classification: pre- versus post-retrieval classification effectiveness, and the effect of training explicitly from classified queries versus bridging a classifier trained using a document taxonomy. Bridging classifiers map the categories of a document taxonomy onto those of a query classification problem to provide sufficient training data. We find that training classifiers explicitly from manually classified queries outperforms the bridged classifier by 48% in F1 score. Also, a pre-retrieval classifier using only the query terms performs merely 11% worse than the bridged classifier, which requires snippets from retrieved documents.
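
As a rough illustration of the bridging idea described in the abstract, the sketch below folds scores over document-taxonomy categories into a smaller set of target query categories, then scores a prediction with F1. The abstract does not specify how the mapping is built, so the mapping table, the category names, the score values, and the function names `bridge_scores` and `f1_score` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from document-taxonomy categories to target
# query categories; the real mapping is not given in the abstract.
TAXONOMY_TO_QUERY_CATEGORY = {
    "Top/Computers/Software": "Computers",
    "Top/Computers/Hardware": "Computers",
    "Top/Recreation/Travel": "Travel",
    "Top/Sports/Soccer": "Sports",
}


def bridge_scores(doc_category_scores, mapping=TAXONOMY_TO_QUERY_CATEGORY):
    """Sum document-taxonomy scores into the target query categories.

    Taxonomy categories with no entry in the mapping are ignored.
    """
    query_scores = defaultdict(float)
    for doc_category, score in doc_category_scores.items():
        target = mapping.get(doc_category)
        if target is not None:
            query_scores[target] += score
    return dict(query_scores)


def f1_score(predicted, relevant):
    """Harmonic mean of precision and recall over two sets of labels."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Assumed output of a document classifier applied to a query's snippets.
doc_scores = {
    "Top/Computers/Software": 0.5,
    "Top/Computers/Hardware": 0.25,
    "Top/Sports/Soccer": 0.25,
}
bridged = bridge_scores(doc_scores)
```

In this sketch the post-retrieval cost is implicit: `doc_scores` would come from classifying retrieved snippets, whereas a pre-retrieval classifier would have to produce its labels from the query terms alone.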