The original Yahoo! search engine consisted of a manually organized topic hierarchy of webpages for easy browsing. Modern search engines (such as Google and Bing), on the other hand, return a flat list of webpages based on keywords. Ideally, hierarchical browsing and keyword search would be seamlessly combined. The main difficulty in doing so is to automatically (i.e., not manually) classify and rank a massive number of webpages into various hierarchies (such as topics, media types, and regions of the world). In this paper we report our attempt at building such an integrated search engine, called SEE (Search Engine with hiErarchy). We implement a hierarchical classification system based on Support Vector Machines and embed it in SEE. We also design a novel user interface that lets users dynamically adjust the trade-off between higher accuracy and more results in any (sub)category of the hierarchy. Though our current search engine is still small (indexing about 1.2 million webpages), the results, including a small user study, show great promise for integrating such techniques into next-generation search engines.
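The accuracy-versus-results adjustment can be pictured as a confidence threshold applied per category. The abstract does not say how SEE implements it, so the following is only a minimal sketch under stated assumptions: each page carries a per-category classifier confidence in [0, 1], and a page is listed under a category only if its confidence at every level of the path meets the user's threshold. The hierarchy, URLs, scores, and the names `PARENT`, `Page`, `path_confidence`, and `results_in_category` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical topic hierarchy: child -> parent (None marks a top-level category).
PARENT = {
    "Science": None,
    "Science/Physics": "Science",
    "Science/Biology": "Science",
}


@dataclass
class Page:
    url: str
    scores: dict  # category -> classifier confidence in [0, 1] (made-up values)


def path_confidence(page, category):
    """Smallest confidence on the path from the top-level category to `category`."""
    conf = 1.0
    node = category
    while node is not None:
        # A category with no score for this page counts as confidence 0.
        conf = min(conf, page.scores.get(node, 0.0))
        node = PARENT[node]
    return conf


def results_in_category(pages, category, threshold):
    """URLs whose path confidence meets the threshold, most confident first."""
    scored = [(path_confidence(p, category), p.url) for p in pages]
    kept = [(c, u) for c, u in scored if c >= threshold]
    return [u for c, u in sorted(kept, key=lambda cu: -cu[0])]


# Illustrative data only; not taken from the paper.
PAGES = [
    Page("a.example", {"Science": 0.95, "Science/Physics": 0.90}),
    Page("b.example", {"Science": 0.80, "Science/Physics": 0.55}),
    Page("c.example", {"Science": 0.60, "Science/Biology": 0.70}),
]

strict = results_in_category(PAGES, "Science/Physics", 0.8)  # fewer, more confident
loose = results_in_category(PAGES, "Science/Physics", 0.5)   # more, less confident
```

Raising the threshold shrinks the list to the pages the classifier is most sure about, while lowering it admits more pages at the cost of more misclassified ones. Taking the minimum along the path is one simple choice for combining levels; a product of per-level confidences would be an equally plausible alternative.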