Learning random forests for ranking

Authors:
Liangxiao Jiang
Affiliations:
Department of Computer Science, China University of Geosciences, Wuhan, China 430074
Venue:
Frontiers of Computer Science in China
Year:
2011

Citing 19
Cited 1

A Distance-Based Attribute Selection Measure for Decision Tree Induction

Machine Learning
C4.5: programs for machine learning

C4.5: programs for machine learning
Bagging predictors

Machine Learning
An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization

Machine Learning
A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems

Machine Learning
Machine Learning

Machine Learning
Random Forests

Machine Learning
An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants

Machine Learning
Induction of Decision Trees

Machine Learning
Pruning Decision Trees with Misclassification Costs

ECML '98 Proceedings of the 10th European Conference on Machine Learning
The Case against Accuracy Estimation for Comparing Induction Algorithms

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Tree Induction for Probability-Based Ranking

Machine Learning
Inference for the Generalization Error

Machine Learning
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)

Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
An Improved Attribute Selection Measure for Decision Tree Induction

FSKD '07 Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 04
Learning decision tree for ranking

Knowledge and Information Systems
AUC: a statistically consistent and more discriminating measure than accuracy

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Scaling Up the Accuracy of Bayesian Network Classifiers by M-Estimate

ICIC '07 Proceedings of the 3rd International Conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence
The use of the area under the ROC curve in the evaluation of machine learning algorithms

Pattern Recognition

Editorial: Modifications of the construction and voting mechanisms of the Random Forests Algorithm

Data & Knowledge Engineering

Quantified Score

Hi-index	0.00

Visualization

Abstract

The random forests (RF) algorithm, which combines the predictions from an ensemble of random trees, has achieved significant improvements in terms of classification accuracy. In many real-world applications, however, ranking is often required in order to make optimal decisions. Thus, we focus our attention on the ranking performance of RF in this paper. Our experimental results based on the entire 36 UC Irvine Machine Learning Repository (UCI) data sets published on the main website of Weka platform show that RF doesn't perform well in ranking, and is even about the same as a single C4.4 tree. This fact raises the question of whether several improvements to RF can scale up its ranking performance. To answer this question, we single out an improved random forests (IRF) algorithm. Instead of the information gain measure and the maximum-likelihood estimate, the average gain measure and the similarity-weighted estimate are used in IRF. Our experiments show that IRF significantly outperforms all the other algorithms used to compare in terms of ranking while maintains the high classification accuracy characterizing RF.