Accurate topical classification of user queries allows for increased effectiveness and efficiency in general-purpose Web search systems. Such classification becomes critical if the system must route queries to a subset of topic-specific and resource-constrained back-end databases. Successful query classification poses a challenging problem, as Web queries are short, thus providing few features. This feature sparseness, coupled with the constantly changing distribution and vocabulary of queries, hinders traditional text classification. We attack this problem by combining multiple classifiers, including exact lookup and partial matching in databases of manually classified frequent queries, linear models trained by supervised learning, and a novel approach based on mining selectional preferences from a large unlabeled query log. Our approach classifies queries without using external sources of information, such as online Web directories or the contents of retrieved pages, making it viable for use in demanding operational environments, such as large-scale Web search services. We evaluate our approach using a large sample of queries from an operational Web search engine and show that our combined method increases recall by nearly 40% over the best single method while maintaining adequate precision. Additionally, we compare our results to those from the 2005 KDD Cup and find that we perform competitively despite our operational restrictions. This suggests it is possible to topically classify a significant portion of the query stream without requiring external sources of information, allowing for deployment in operationally restricted environments.
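As a rough illustration of combining classifiers, the sketch below joins two of the components named above: exact lookup and partial matching against a database of manually classified queries. The abstract does not say how the individual outputs are merged, so the set union used here is an assumption, chosen because a union can only add labels and therefore tends to raise recall. The toy database, the function names, and the n-gram form of partial matching are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Set

# A classifier maps a raw query string to a (possibly empty) set of topic labels.
Classifier = Callable[[str], Set[str]]

# Hypothetical toy database of manually classified frequent queries.
LABELED: Dict[str, Set[str]] = {
    "nfl scores": {"sports"},
    "cheap flights": {"travel"},
    "flights": {"travel"},
}


def normalize(query: str) -> str:
    """Lowercase the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def exact_lookup(query: str) -> Set[str]:
    """Return labels only if the whole normalized query is in the database."""
    return set(LABELED.get(normalize(query), set()))


def partial_match(query: str) -> Set[str]:
    """Return labels of every contiguous word n-gram found in the database.

    This n-gram scheme is an assumed stand-in for partial matching.
    """
    tokens = normalize(query).split()
    labels: Set[str] = set()
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            labels |= LABELED.get(" ".join(tokens[start:end]), set())
    return labels


def combine(classifiers: List[Classifier], query: str) -> Set[str]:
    """Union the labels from all classifiers (assumed combination rule)."""
    labels: Set[str] = set()
    for classify in classifiers:
        labels |= classify(query)
    return labels


if __name__ == "__main__":
    query = "cheap flights to paris"
    print(exact_lookup(query))
    print(combine([exact_lookup, partial_match], query))
```

In this toy setup the exact lookup finds nothing for a long query that is not in the database, while the partial matcher still recovers a label from a sub-phrase. A linear model or a selectional-preference classifier could be added to the list passed to `combine` in the same way, provided it exposes the same query-to-labels interface.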