Journal of Web Engineering
Organizing Web search results into hierarchical categories helps users browse them, especially for ambiguous queries where results for different senses of the query are mixed together. Previous methods for search result classification usually pre-train a classification model on a fixed, shallow hierarchy in which only the top two levels of a Web taxonomy are used. Such classification may be too coarse for browsing, since most search results end up in only two or three shallow categories. A deep hierarchical classifier, by contrast, must provide many more categories. However, the performance of such classifiers is usually limited because their classification effectiveness can deteriorate rapidly at the third or fourth level of a hierarchy. In this paper, we propose a novel algorithm, Deep Classifier, that classifies search results into detailed hierarchical categories with higher effectiveness than previous approaches. Given the search results returned for a query, the algorithm first prunes a wide-ranging hierarchy into a narrow one with the help of Web directories. Different strategies are then proposed to select the training data by exploiting the hierarchical structure. Finally, a discriminative naïve Bayesian classifier is developed to perform efficient and effective classification. As a result, the algorithm can provide more meaningful and specific class labels for search result browsing than shallow classification. Our experiments show that Deep Classifier achieves significant improvement over state-of-the-art algorithms. In addition, with sufficient off-line preparation, the proposed algorithm is efficient enough for online application.
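The three stages named in the abstract (prune the hierarchy, select training data, classify) can be illustrated with a minimal sketch. It is not the paper's implementation, and several pieces are assumptions made for illustration only: the function names, the toy directory, pruning by term overlap, training only on documents filed directly under each surviving category, and a plain multinomial naive Bayes with add-one smoothing standing in for the discriminative variant the abstract mentions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


def prune_hierarchy(snippet, category_docs, top_k=3):
    """Stage 1 (assumed heuristic): keep the top_k deep category paths
    that share the most distinct terms with the search-result snippet."""
    terms = set(tokenize(snippet))
    overlap = {}
    for path, docs in category_docs.items():
        vocab = {tok for doc in docs for tok in tokenize(doc)}
        overlap[path] = len(terms & vocab)
    # Sort by descending overlap, then by path so ties break deterministically.
    ranked = sorted(overlap, key=lambda p: (-overlap[p], p))
    return ranked[:top_k]


def train(category_docs, candidates):
    """Stage 2 (assumed strategy): count terms using only the documents
    filed directly under each surviving category."""
    counts = {c: Counter() for c in candidates}
    for c in candidates:
        for doc in category_docs[c]:
            counts[c].update(tokenize(doc))
    vocab = {tok for c in candidates for tok in counts[c]}
    return counts, vocab


def classify(snippet, counts, vocab):
    """Stage 3: multinomial naive Bayes, add-one smoothing, uniform prior."""
    best_path, best_score = None, -math.inf
    for path in sorted(counts):
        total = sum(counts[path].values())
        score = -math.log(len(counts))  # log of the uniform prior
        for tok in tokenize(snippet):
            if tok in vocab:  # terms unseen in every candidate are skipped
                score += math.log((counts[path][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_path, best_score = path, score
    return best_path


# Hypothetical toy directory: deep category path -> example documents.
directory = {
    "Computers/Programming/Languages/Java": [
        "java programming language jvm compiler",
        "java class object library",
    ],
    "Recreation/Food/Drink/Coffee": [
        "java coffee beans roast espresso",
        "coffee brewing cup",
    ],
    "Regional/Asia/Indonesia/Java": [
        "java island indonesia jakarta travel",
        "indonesia volcano island",
    ],
    "Sports/Soccer/Clubs": ["football league match goal"],
}

snippet = "java coffee roast guide"
candidates = prune_hierarchy(snippet, directory)
counts, vocab = train(directory, candidates)
print(classify(snippet, counts, vocab))  # prints Recreation/Food/Drink/Coffee
```

For the ambiguous query term "java", pruning discards the unrelated sports category before any classifier is trained, so the classifier only has to separate three fourth-level categories instead of the whole taxonomy. This is the intuition behind narrowing the hierarchy per query, though the pruning and training-data selection strategies in the paper itself may differ from this sketch.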