Yellow pages search is a popular service for finding businesses close to particular locations, and efficient yellow pages search is a rapidly evolving research area. The underlying data maintained in yellow pages search engines are typically labeled according to Standard Industrial Classification (SIC) categories, and users can search by category according to their interests. Categorizing yellow pages queries into a subset of topical categories can improve search experience and quality. However, yellow pages queries are usually short and ambiguous, and a yellow pages query taxonomy is typically organized as a hierarchy with a fairly large number of categories. These characteristics make automatic yellow pages query categorization challenging. In this article, we propose a flexible yellow pages query categorization approach. The proposed technique is built on a TF-IDF similarity taxonomy matching scheme that provides more accurate query categorization than previous keyword-based matching schemes. To further improve categorization performance, we design several filtering schemes. Through extensive experimentation, we obtain F1 measures of about 0.5 and 0.3 for categorizing yellow pages queries into 19 coarse categories and 244 finer categories, respectively. We also investigate the individual components of the proposed approach and demonstrate its superiority over a hierarchical support vector machine classifier.
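
To make the idea of TF-IDF similarity matching against a taxonomy concrete, the following is a minimal sketch, not the method of the article itself. It assumes each category is represented by a short bag of descriptive terms, scores a query against every category by cosine similarity of TF-IDF vectors, and applies a simple score threshold as a stand-in for a filtering scheme. The category names, their term lists, the smoothed IDF formula, and the `min_score` and `top_k` parameters are all illustrative assumptions; the abstract does not specify any of them.

```python
import math
from collections import Counter

# Hypothetical category descriptions (assumption: the real taxonomy and its
# term lists are not given in the abstract).
CATEGORIES = {
    "Restaurants": "restaurant pizza sushi dining food takeout",
    "Auto Repair": "auto car repair mechanic brake oil tire",
    "Legal Services": "lawyer attorney legal law firm divorce",
}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def build_index(categories):
    """Return (idf, vectors): IDF weights and one TF-IDF vector per category."""
    docs = {name: Counter(tokenize(text)) for name, text in categories.items()}
    n_docs = len(docs)
    # Document frequency: number of categories whose description has the term.
    df = Counter(term for counts in docs.values() for term in counts)
    # Smoothed IDF (assumption: one common variant among several).
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = {
        name: {t: tf * idf[t] for t, tf in counts.items()}
        for name, counts in docs.items()
    }
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorize(query, idf, vectors, min_score=0.1, top_k=3):
    """Rank categories by similarity to the query, dropping weak matches."""
    counts = Counter(tokenize(query))
    # Query terms that never occur in the taxonomy carry no weight.
    q_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = [(name, cosine(q_vec, vec)) for name, vec in vectors.items()]
    # Hypothetical filtering step: keep only scores at or above the threshold.
    kept = [(name, score) for name, score in scored if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:top_k]


if __name__ == "__main__":
    idf, vectors = build_index(CATEGORIES)
    print(categorize("pizza takeout", idf, vectors))
    print(categorize("car brake repair", idf, vectors))
```

Because the threshold discards categories with low similarity, a query whose terms do not appear anywhere in the taxonomy returns an empty list rather than an arbitrary category, which is one plausible way to handle short, ambiguous queries.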