Using classification trees to assess low birth weight outcomes

  • Authors:
  • Panagiota Kitsantas;Myles Hollander;Lei Li

  • Affiliations:
  • George Mason University, Department of Health Administration and Policy, The College of Health and Human Services, 4400 University Drive, Fairfax, VA 22030, USA;Florida State University, Department of Statistics, Tallahassee, FL 32306, USA;University of Southern California, Department of Biology and Mathematics, Los Angeles, CA 90089, USA

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2006

Abstract

Objective: Low birth weight (LBW) is a major public health problem. Compared with normal birth weight, LBW is positively associated with infant mortality and negatively associated with normative childhood cognitive and physical development. In the past two decades, research has identified important risk factors for LBW. In this study, we used classification trees to study the interactive nature of these factors. In particular, we (1) identified subgroups of women at high risk of an LBW outcome in seven geographical regions of Florida, and (2) studied the predictive performance of classification trees by comparing the tree-based results with those obtained using logistic regression.

Methods: The data, 181,690 singleton births, were derived from Florida birth certificates recorded in 1998. Classification trees and logistic regression models were built for each of the seven geographical regions. The outcome variable consisted of two classes, namely LBW (<2500 g) and normal birth weight (≥2500 g) cases, and a large number of known risk factors were examined. Tree and logistic regression models were compared using receiver operating characteristic (ROC) curves and sensitivity and specificity analyses.

Results: The classification trees revealed a number of high-risk subgroups. For instance, White, Hispanic or Other non-white mothers who were healthy, smoked, and gained less than 20 lb had a higher risk of an LBW birth than mothers with the same characteristics who gained more than 20 lb. Factors such as parity and marital status were important predictors of pregnancy outcomes among nonsmoking White, Hispanic or Other non-white mothers. Furthermore, Black mothers were directly classified as a high-risk subgroup in the Panhandle, Northeast, and North Central regions, while in the Southern regions a series of other characteristics further defined the high-risk subgroup of Black mothers. Overall, the differences in predictive performance between tree models and logistic regression were minimal.

Conclusion: The present study demonstrated that classification trees can be used to identify subgroups of mothers who are at high risk of LBW outcomes. Although these exploratory tree analyses revealed a number of distinctive variable interactions for each geographical area, the variable selection was similar across all seven regions. The study also demonstrated that neither classification trees nor logistic regression models outperformed the other; both approaches provided useful analyses of the data.
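The model comparison described under Methods (ROC curves, sensitivity, and specificity) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and are not taken from the study's data or code. The area under the ROC curve is computed here with the rank-based (Mann-Whitney) formulation, which may differ from the procedure the authors used.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = LBW."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # LBW correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # LBW missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case,
    with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = LBW, 0 = normal birth weight.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
# A tree assigns one score per leaf, so its scores take few distinct values.
tree_scores = [0.8, 0.8, 0.2, 0.8, 0.2, 0.2, 0.2, 0.2]
# A logistic model yields a continuous predicted probability per case.
logit_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

threshold = 0.5  # assumed cut-off for turning scores into class labels
for name, scores in [("tree", tree_scores), ("logistic", logit_scores)]:
    y_pred = [1 if s >= threshold else 0 for s in scores]
    sens, spec = sensitivity_specificity(y_true, y_pred)
    auc = roc_auc(y_true, scores)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the ROC curve summarizes performance across all thresholds, which is one reason the two kinds of analysis are typically reported together.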