Unsupervised Similarity-Based Risk Stratification for Cardiovascular Events Using Long-Term Time-Series Data

Authors: Zeeshan Syed; John Guttag
Venue: The Journal of Machine Learning Research
Year: 2011


Abstract

In medicine, decisions are often based on a comparative analysis of patient data. In this paper, we build on this observation and describe similarity-based algorithms to risk stratify patients for major adverse cardiac events. We extend the traditional approach of comparing patient data in two ways. First, we propose similarity-based algorithms that compare patients in terms of their long-term physiological monitoring data. Our measure, symbolic mismatch, identifies functional units in long-term signals and measures changes in the morphology and frequency of these units across patients. Second, we describe similarity-based algorithms that are unsupervised and do not require comparisons to patients with known outcomes for risk stratification. This is achieved by using an anomaly detection framework to identify patients who are unlike other patients in a population and may be at elevated risk. We demonstrate the potential utility of our approach by showing how symbolic mismatch-based algorithms can classify patients as being at high or low risk of major adverse cardiac events by comparing their long-term electrocardiograms to those of a large population. We describe how symbolic mismatch can be used in three existing methods: one-class support vector machines, nearest neighbor analysis, and hierarchical clustering. When evaluated on a population of 686 patients with available long-term electrocardiographic data, symbolic mismatch-based comparative approaches identified patients at roughly a two-fold increased risk of major adverse cardiac events in the 90 days following acute coronary syndrome. These results were consistent even after adjusting for other clinical risk variables.
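
As a rough illustration of the unsupervised nearest neighbor variant mentioned above, the sketch below scores each patient by the mean dissimilarity to its k closest peers and flags the most anomalous fraction. It is not the paper's implementation: symbolic mismatch itself is not computed here, the pairwise dissimilarity matrix is assumed to be precomputed, and the names `knn_anomaly_scores` and `flag_high_risk`, the toy matrix, and the parameters `k` and `fraction` are all hypothetical.

```python
from typing import List, Sequence


def knn_anomaly_scores(dissim: Sequence[Sequence[float]], k: int) -> List[float]:
    """Score each patient by the mean dissimilarity to its k nearest neighbors."""
    n = len(dissim)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of patients")
    scores = []
    for i in range(n):
        # Dissimilarities from patient i to every other patient (self excluded),
        # sorted so the first k entries are the nearest neighbors.
        others = sorted(dissim[i][j] for j in range(n) if j != i)
        scores.append(sum(others[:k]) / k)
    return scores


def flag_high_risk(scores: Sequence[float], fraction: float) -> List[bool]:
    """Flag the top `fraction` of patients by anomaly score (hypothetical cutoff)."""
    n_flag = max(1, round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_flag])
    return [i in flagged for i in range(len(scores))]


# Toy symmetric dissimilarity matrix for four hypothetical patients.
# In the setting described above, each entry would be a pairwise symbolic
# mismatch value; these numbers are made up for illustration only.
toy = [
    [0.00, 0.10, 0.20, 0.90],
    [0.10, 0.00, 0.15, 0.80],
    [0.20, 0.15, 0.00, 0.85],
    [0.90, 0.80, 0.85, 0.00],
]
scores = knn_anomaly_scores(toy, k=2)
flags = flag_high_risk(scores, fraction=0.25)
```

In this toy matrix the fourth patient is far from all others and is the only one flagged. No outcome labels are used anywhere, which is the sense in which the approach is unsupervised. The same precomputed matrix could in principle feed a one-class support vector machine or a hierarchical clustering step instead.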