Ensembling Regression Models to Improve Their Predictivity: A Case Study in QSAR (Quantitative Structure Activity Relationships) with Computational Chemometrics

  • Authors:
  • Giuseppina Gini;Tushar Garg;Marco Stefanelli

  • Affiliations:
  • Dipartimento di Elettronica ed Informazione, Politecnico di Milano, Milan, Italy; Dipartimento di Elettronica ed Informazione, Politecnico di Milano, Milan, Italy, and Indian Institute of Technology, Guwahati, India; Dipartimento di Elettronica ed Informazione, Politecnico di Milano, Milan, Italy

  • Venue:
  • Applied Artificial Intelligence
  • Year:
  • 2009

Abstract

The last several years have seen an increasing emphasis on mathematical models based on both statistics and machine learning. Today, Bayesian nets, neural nets, support vector machines (SVM), and induction trees are commonly used in the analysis of scientific data. Moreover, a recent emphasis in the modelling community is on improving the performance of classifiers by ensembling several diverse and accurate models in order to reduce the prediction error. Ensembling is, in fact, a way of taking advantage of good models that make errors in different parts of the data space. We outline the developments in model construction and evaluation through those techniques, justify their use, and propose some quantitative structure-activity relationship (QSAR) models based on ensembling. The models presented here are in the area of predicting acute toxicity for use in regulatory systems. The emphasis is on the better performance of ensembles, since the general goal of delivering usable QSAR models involves other requirements that are outside the scope of this article.
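
The idea that an ensemble benefits from models erring in different parts of the data space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy target `y = x`, the two hypothetical models `model_a` and `model_b` (each exact on one half of the input range and biased on the other), and the choice of a plain unweighted average as the combination rule.

```python
def mse(predictions, targets):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def model_a(x):
    # Hypothetical model: exact for x < 0.5, overestimates by 1.0 elsewhere.
    return x if x < 0.5 else x + 1.0


def model_b(x):
    # Hypothetical model: underestimates by 1.0 for x < 0.5, exact elsewhere.
    return x - 1.0 if x < 0.5 else x


def ensemble(x):
    # Unweighted average of the two models (one simple way to combine them).
    return (model_a(x) + model_b(x)) / 2.0


# Toy data: ten evenly spaced inputs with the true relationship y = x.
xs = [i / 10 for i in range(10)]
ys = list(xs)

mse_a = mse([model_a(x) for x in xs], ys)
mse_b = mse([model_b(x) for x in xs], ys)
mse_ens = mse([ensemble(x) for x in xs], ys)

print(f"model A MSE:  {mse_a:.3f}")
print(f"model B MSE:  {mse_b:.3f}")
print(f"ensemble MSE: {mse_ens:.3f}")
```

In this toy setting each model has an error of magnitude 1.0 on half of the points, while the average has an error of magnitude 0.5 on all of them, so the squared error of the ensemble is lower than that of either member. The gain depends on the members' errors not coinciding; averaging models that make the same mistakes would bring no such benefit.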