Stochastic gradient boosting

Authors:
Jerome H. Friedman
Affiliations:
Department of Statistics and Stanford Linear Accelerator Center, Stanford University, Stanford, CA
Venue:
Computational Statistics & Data Analysis - Nonlinear methods and data mining
Year:
2002

Abstract

Gradient boosting constructs additive regression models by sequentially fitting a simple parameterized function (base learner) to current "pseudo"-residuals by least squares at each iteration. The pseudo-residuals are the gradient of the loss functional being minimized, with respect to the model values at each training data point evaluated at the current step. It is shown that both the approximation accuracy and execution speed of gradient boosting can be substantially improved by incorporating randomization into the procedure. Specifically, at each iteration a subsample of the training data is drawn at random (without replacement) from the full training data set. This randomly selected subsample is then used in place of the full sample to fit the base learner and compute the model update for the current iteration. This randomized approach also increases robustness against overcapacity of the base learner.
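
The procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions the abstract does not state: squared-error loss (so the pseudo-residual at each point is simply y - F(x), the negative gradient), a one-split regression stump on a single input as the base learner, and a fixed shrinkage factor applied to each update. The function names and the parameters `fraction`, `shrinkage`, and `n_iter` are hypothetical, chosen for illustration only.

```python
import random


def fit_stump(xs, rs):
    """Least-squares regression stump on 1-D inputs.

    Returns (threshold, left_mean, right_mean).
    """
    pairs = sorted(zip(xs, rs))
    n = len(pairs)
    total = sum(r for _, r in pairs)
    best = None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no valid split between equal x values
        right_sum = total - left_sum
        # Maximising this score is equivalent to minimising squared error.
        score = left_sum ** 2 / i + right_sum ** 2 / (n - i)
        if best is None or score > best[0]:
            thr = 0.5 * (pairs[i - 1][0] + pairs[i][0])
            best = (score, thr, left_sum / i, right_sum / (n - i))
    if best is None:  # no split possible: fall back to a constant
        mean = total / n
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    thr, left, right = stump
    return left if x <= thr else right


def stochastic_gradient_boost(xs, ys, n_iter=200, fraction=0.5,
                              shrinkage=0.1, seed=0):
    """Boost stumps under squared-error loss (an assumption of this sketch)."""
    rng = random.Random(seed)
    n = len(xs)
    f0 = sum(ys) / n                      # initial constant model
    preds = [f0] * n
    stumps = []
    m = min(n, max(1, int(fraction * n)))
    for _ in range(n_iter):
        # Draw a random subsample WITHOUT replacement.
        idx = rng.sample(range(n), m)
        sub_x = [xs[i] for i in idx]
        # Pseudo-residuals for squared error: y - F(x) at the current step.
        sub_r = [ys[i] - preds[i] for i in idx]
        # Fit the base learner on the subsample only.
        stump = fit_stump(sub_x, sub_r)
        stumps.append(stump)
        # Apply the (shrunken) update to the model at every training point.
        for i in range(n):
            preds[i] += shrinkage * predict_stump(stump, xs[i])
    return f0, shrinkage, stumps


def predict(model, x):
    f0, shrinkage, stumps = model
    return f0 + shrinkage * sum(predict_stump(s, x) for s in stumps)
```

Setting `fraction=1.0` in this sketch recovers ordinary (deterministic) gradient boosting, since every iteration then fits the stump to the full training set; smaller values also reduce the work per iteration because the base learner sees fewer points.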