Penalized solutions to functional regression problems

  • Authors:
  • Jaroslaw Harezlak; Brent A. Coull; Nan M. Laird; Shannon R. Magari; David C. Christiani

  • Affiliations:
  • Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA; Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA; Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA; Colden Corporation, 100 North 17th Street, 9th Floor, Philadelphia, PA 19103, USA; Department of Environmental Health, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2007

Abstract

Recent technological advances in continuous biological monitoring and personal exposure assessment have led to the collection of subject-specific functional data. A primary goal in such studies is to assess the relationship between the functional predictors and the functional responses. The historical functional linear model (HFLM) can be used to model such dependencies of the response on the history of the predictor values. An estimation procedure for the regression coefficients that uses a variety of regularization techniques is proposed. An approximation of the regression surface relating the predictor to the outcome by a finite-dimensional basis expansion is used, followed by penalization of the coefficients of the neighboring basis functions by restricting the size of the coefficient differences to be small. Penalties based on the absolute values of the basis function coefficient differences (corresponding to the LASSO) and the squares of these differences (corresponding to the penalized spline methodology) are studied. The fits are compared using an extension of the Akaike Information Criterion that combines the error variance estimate, degrees of freedom of the fit and the norm of the basis function coefficients. The performance of the proposed methods is evaluated via simulations. The LASSO penalty applied to the linearly transformed coefficients yields sparser representations of the estimated regression surface, while the quadratic penalty provides solutions with the smallest L2-norm of the basis function coefficients. Finally, the new estimation procedure is applied to the analysis of the effects of occupational particulate matter (PM) exposure on heart rate variability (HRV) in a cohort of boilermaker workers. Results suggest that the strongest association between PM exposure and HRV in these workers occurs as a result of point exposures to the increased levels of PM corresponding to smoking breaks.
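
The two difference penalties named in the abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes a one-dimensional coefficient vector rather than the regression surface of the HFLM, uses first-order differences between neighboring coefficients, and solves only the quadratic (penalized-spline style) case in closed form. The absolute-value (LASSO style) penalty is merely evaluated, since fitting it needs an iterative solver. All function names and the tuning parameter `lam` are hypothetical.

```python
def first_differences(coefs):
    """Differences between neighboring coefficients: b[j+1] - b[j]."""
    return [b - a for a, b in zip(coefs, coefs[1:])]


def l1_penalty(coefs):
    """Sum of absolute coefficient differences (LASSO-type penalty)."""
    return sum(abs(d) for d in first_differences(coefs))


def l2_penalty(coefs):
    """Sum of squared coefficient differences (penalized-spline-type penalty)."""
    return sum(d * d for d in first_differences(coefs))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def quadratic_difference_fit(X, y, lam):
    """Minimize ||y - X b||^2 + lam * sum_j (b[j+1] - b[j])^2.

    The minimizer solves (X'X + lam * D'D) b = X'y, where D is the
    first-difference matrix; D'D is tridiagonal and is added in place.
    """
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for j in range(p - 1):
        A[j][j] += lam
        A[j + 1][j + 1] += lam
        A[j][j + 1] -= lam
        A[j + 1][j] -= lam
    return solve(A, Xty)


# Toy illustration (made-up numbers): identity design, so b is a smoothed y.
X = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
y = [1.0, 3.0, 2.0, 6.0]
b_smooth = quadratic_difference_fit(X, y, lam=1.0)
```

With `lam = 0` the fit reproduces the unpenalized least-squares solution, and as `lam` grows the neighboring coefficients are pulled toward each other. The absolute-value penalty differs in that it can set some differences exactly to zero, which is the source of the sparser representations mentioned in the abstract.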