Time series classification for the prediction of dialysis in critically ill patients using echo state networks

  • Authors:
  • Femke Ongenae;Stijn Van Looy;David Verstraeten;Thierry Verplancke;Dominique Benoit;Filip De Turck;Tom Dhaene;Benjamin Schrauwen;Johan Decruyenaere

  • Affiliations:
  • Department of Information Technology (INTEC), Ghent University - Interdisciplinary Institute for Broadband Technology (IBBT), Gaston Crommenlaan 8 bus 201, B-9050 Ghent, Belgium;Department of Environmental Modelling, Flemish Institute for Technological Research (VITO), Boeretang 200, 2400 Mol, Belgium;Department of Electronics and Information Systems (ELIS), Ghent University, Sint-Pietersnieuwstraat 41, 9000 Ghent, Belgium;Department of Intensive Care Medicine, Ghent University Hospital, De Pintelaan 185 - 2K12 IC, B-9000 Ghent, Belgium;Department of Intensive Care Medicine, Ghent University Hospital, De Pintelaan 185 - 2K12 IC, B-9000 Ghent, Belgium;Department of Information Technology (INTEC), Ghent University - Interdisciplinary Institute for Broadband Technology (IBBT), Gaston Crommenlaan 8 bus 201, B-9050 Ghent, Belgium;Department of Information Technology (INTEC), Ghent University - Interdisciplinary Institute for Broadband Technology (IBBT), Gaston Crommenlaan 8 bus 201, B-9050 Ghent, Belgium;Department of Electronics and Information Systems (ELIS), Ghent University, Sint-Pietersnieuwstraat 41, 9000 Ghent, Belgium;Department of Intensive Care Medicine, Ghent University Hospital, De Pintelaan 185 - 2K12 IC, B-9000 Ghent, Belgium

  • Venue:
  • Engineering Applications of Artificial Intelligence
  • Year:
  • 2013

Abstract

Objective: Time series often appear in medical databases, but only a few machine learning methods process this kind of data properly. Most modeling techniques have been designed with a static data model in mind and are not suitable for coping with the dynamic nature of time series. Recurrent neural networks (RNNs) are often used to process time series, but only a few training algorithms exist for RNNs, and these are complex and often yield poor results. Therefore, researchers often turn to traditional machine learning approaches, such as support vector machines (SVMs), which can easily be set up and trained, and combine them with feature extraction (FE) and feature selection (FS) to process the high-dimensional temporal data. Recently, a new approach, called echo state networks (ESNs), has been developed to simplify the training process of RNNs. This approach allows modeling the dynamics of a system based on time series data in a straightforward way. The objective of this study is to explore the advantages of using ESNs instead of traditional classifiers combined with FE and FS in classification problems in the intensive care unit (ICU) when the input data consists of time series. While ESNs have mostly been used to predict the future course of a time series, we use the ESN model for classification instead. Although time series often appear in medical data, few medical applications of ESNs have been studied so far.

Methods and material: An ESN is used to predict the need for dialysis between the fifth and tenth day after admission to the ICU. The input time series consist of diuresis and creatinine values measured during the first 3 days after admission. Data from 830 patients were used for the study, of whom 82 needed dialysis between the fifth and tenth day after admission. The ESN is compared to two traditional classifiers, a sophisticated one and a simple one, namely the SVM and the naive Bayes (NB) classifier.
Prior to the use of the SVM and NB classifiers, FE and FS are required to reduce the number of input features and thus alleviate the curse of dimensionality. Extensive feature extraction was applied to capture both the overall properties of the time series and the correlation between the different measurements in the time series. The feature selection method is a greedy hybrid filter-wrapper method using an NB classifier, which in each iteration selects the feature that improves prediction the most and shows little multicollinearity with the already selected set. Least squares regression with noise was used to train the linear readout function of the ESN to mitigate sensitivity to noise and overfitting. Fisher labeling was used to deal with the unbalanced data set. Parameter sweeps were performed to determine the optimal parameter values for the different classifiers. The area under the curve (AUC) and maximum balanced accuracy are used as performance measures. The required execution time was also measured.

Results: The classification performance of the ESN differs significantly at the 5% level from the performance of the SVM or the NB classifier combined with FE and FS. The NB+FE+FS, with an average AUC of 0.874, has the best classification performance. This classifier is followed by the ESN, which has an average AUC of 0.849. The SVM+FE+FS has the worst performance with an average AUC of 0.838. The computation time needed to pre-process the data and to train and test the classifier is significantly less for the ESN than for the SVM and NB.

Conclusion: It can be concluded that the use of ESN has an added value in predicting the need for dialysis through the analysis of time series data. The ESN requires significantly less processing time, needs no domain knowledge, is easy to implement, and can be configured using rules of thumb.
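
Illustrative sketch: The ESN pipeline summarized in the abstract (a fixed random reservoir, a linear readout trained by least squares on noisy states, Fisher labeling for class imbalance, and AUC as the performance measure) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The reservoir size, spectral radius, input scaling, noise level, and the choice to average the readout output over time are all hypothetical values chosen for the example. "Fisher labeling" is interpreted here as the common target coding N/N1 for one class and -N/N0 for the other, which the abstract does not spell out.

```python
import numpy as np

def make_reservoir(n_inputs, n_units=50, spectral_radius=0.9,
                   input_scale=0.5, seed=0):
    """Create fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-input_scale, input_scale, size=(n_units, n_inputs))
    w = rng.normal(size=(n_units, n_units))
    # Rescale so the largest absolute eigenvalue equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(series, w_in, w):
    """Drive the reservoir with a (T, n_inputs) series; return (T, n_units) states."""
    x = np.zeros(w.shape[0])
    states = np.empty((len(series), w.shape[0]))
    for t, u in enumerate(series):
        x = np.tanh(w_in @ u + w @ x)
        states[t] = x
    return states

def fisher_labels(y):
    """Assumed coding: N/N1 for positives, -N/N0 for negatives (sums to zero)."""
    y = np.asarray(y)
    n, n_pos = len(y), y.sum()
    return np.where(y == 1, n / n_pos, -n / (n - n_pos))

def add_bias(states):
    """Append a constant column so the readout can learn an offset."""
    return np.hstack([states, np.ones((len(states), 1))])

def train_readout(series_list, y, w_in, w, noise_std=0.01, seed=0):
    """Least squares fit of the linear readout on noise-perturbed states."""
    rng = np.random.default_rng(seed)
    rows, targets = [], []
    for series, target in zip(series_list, fisher_labels(y)):
        states = run_reservoir(series, w_in, w)
        states = states + rng.normal(scale=noise_std, size=states.shape)
        rows.append(add_bias(states))
        targets.append(np.full(len(states), target))
    w_out, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(targets),
                                rcond=None)
    return w_out

def score(series, w_in, w, w_out):
    """Classification score: readout output averaged over all time steps."""
    return float(np.mean(add_bias(run_reservoir(series, w_in, w)) @ w_out))

def auc(scores, y):
    """Probability that a random positive outscores a random negative."""
    scores, y = np.asarray(scores), np.asarray(y)
    diff = scores[y == 1][:, None] - scores[y == 0][None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))
```

In this sketch only `w_out` is fitted, which is why training reduces to a single linear solve. A patient would be represented as an array of shape (T, 2) holding diuresis and creatinine values, and `auc` would be applied to the scores of held-out patients.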