Modern search engines have to be fast to satisfy users, so there are hard back-end latency requirements. The set of features useful for search ranking functions, however, continues to grow, making feature computation a latency bottleneck. As a result, not all available features can be used for ranking; in fact, much of the time only a small percentage of them can be. It is therefore crucial to have a feature selection mechanism that can find a subset of features that both meets latency requirements and achieves high relevance. To this end, we explore different feature selection methods using boosted regression trees, including greedy approaches (selecting the features with the highest relative importance as computed by boosted trees, or discounting importance by feature similarity) and a randomized approach. We evaluate and compare these approaches using data from a commercial search engine. The experimental results show that the proposed randomized feature selection with feature-importance-based backward elimination outperforms the greedy approaches and, with only 30 features, achieves relevance comparable to a full-feature model trained with 419 features and the same modeling parameters.
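The abstract does not spell out the algorithmic details, so the sketch below is only one plausible reading of "randomized feature selection with feature-importance-based backward elimination": repeatedly fit boosted regression trees, then randomly drop features with probability inversely related to their importance. The dataset, drop schedule, and probability weighting are illustrative assumptions, not the authors' exact method; scikit-learn's gradient-boosted trees stand in for the paper's ranking model.

```python
# Hedged sketch: assumed parameters throughout, not the paper's algorithm.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in for ranking data (the paper uses commercial search data).
X, y = make_regression(n_samples=500, n_features=40, n_informative=10,
                       random_state=0)

def backward_eliminate(X, y, n_keep, drops_per_round=2):
    """Randomized, importance-based backward elimination (illustrative).

    Each round fits boosted trees on the remaining features and removes a
    few of them at random, where low-importance features are the most
    likely to be dropped. The randomness keeps the search stochastic
    instead of purely greedy.
    """
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        model = GradientBoostingRegressor(n_estimators=50, random_state=0)
        model.fit(X[:, remaining], y)
        imp = model.feature_importances_
        # Turn importances into drop probabilities: least important
        # features get the highest chance of elimination this round.
        inv = 1.0 / (imp + 1e-12)
        p = inv / inv.sum()
        k = min(drops_per_round, len(remaining) - n_keep)
        drop = set(rng.choice(len(remaining), size=k, replace=False, p=p))
        remaining = [f for i, f in enumerate(remaining) if i not in drop]
    return remaining

selected = backward_eliminate(X, y, n_keep=10)
print("selected feature indices:", selected)
```

A greedy baseline, for comparison, would simply fit the model once and keep the `n_keep` features with the highest `feature_importances_`; the randomized elimination above trades extra training rounds for a broader search of feature subsets.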