Mining Very Large Databases with Parallel Processing
Advances in Distributed and Parallel Knowledge Discovery
Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
A decade of progress in indexing and mining large time series databases
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
iSAX: indexing and mining terabyte sized time series
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Disk Aware Discord Discovery: Finding Unusual Time Series in Terabyte Sized Datasets
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
Validity of the single processor approach to achieving large scale computing capabilities
AFIPS '67 (Spring) Proceedings of the April 18-20, 1967, spring joint computer conference
Time series shapelets: a new primitive for data mining
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Managing massive time series streams with multi-scale compressed trickles
Proceedings of the VLDB Endowment
Patient-Specific Seizure Detection from Intra-cranial EEG Using High Dimensional Clustering
ICMLA '10 Proceedings of the 2010 Ninth International Conference on Machine Learning and Applications
Recent advances in mining time series data
PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
Epilepsy is a chronic neurological disorder characterized by recurrent, unprovoked seizures that manifest in a variety of ways, including emotional or behavioral disturbances, convulsive movements, and loss of awareness. Predicting epileptic seizures is hard, and most algorithms perform no better than a random predictor [20]. An important reason why studies so far have had limited success is that the electroencephalogram (EEG) is not recorded at the granularity of the seizure generation process. Our collaborators at the Columbia University Medical School (CUMC) are involved in a clinical trial in which a Micro-Electrode Array is implanted directly into the neocortex of epilepsy patients undergoing surgery to remove the portion of the brain where seizures originate. The 96-contact grid records at 30 kHz per channel, a very high resolution compared with known state-of-the-art techniques, and yields both local field and action potential data (0.5 TB per patient per day). This large volume of data poses challenges for knowledge discovery and mining. In this paper, we describe the steps required to process the EEG signal and extract features, and we present a parallel design for scaling up processing on multi-core machines and an in-house cluster. Initial benchmarking results indicate that approximately six cores of a machine (2.7 GHz processing speed, 32 GB RAM, moderate workload) are sufficient to process a 5-minute chunk of data from 96 channels in approximately 12 minutes. Encouraged by these results, we discuss the design of other machine learning algorithms for learning from the data.
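The figures quoted in the abstract can be sanity-checked with a short calculation. The sketch below is illustrative only: the sample width of 2 bytes (16-bit samples) is an assumption that the abstract does not state, and the core estimate assumes ideal linear scaling, which real workloads rarely achieve. All function names are hypothetical.

```python
import math

# Figures taken from the abstract.
CHANNELS = 96
SAMPLE_RATE_HZ = 30_000
SECONDS_PER_DAY = 24 * 60 * 60

# ASSUMPTION: 16-bit samples. The abstract does not give the sample width.
BYTES_PER_SAMPLE = 2


def daily_volume_tb(channels=CHANNELS, rate_hz=SAMPLE_RATE_HZ,
                    bytes_per_sample=BYTES_PER_SAMPLE):
    """Raw recording volume per day in terabytes (1 TB = 10**12 bytes)."""
    return channels * rate_hz * bytes_per_sample * SECONDS_PER_DAY / 1e12


def slowdown_factor(chunk_minutes=5, processing_minutes=12):
    """How many times slower than real time the benchmark runs."""
    return processing_minutes / chunk_minutes


def cores_for_real_time(cores_used=6, chunk_minutes=5, processing_minutes=12):
    """Cores needed to keep pace with recording.

    ASSUMPTION: ideal linear scaling with core count, so this is a lower
    bound rather than a prediction.
    """
    return math.ceil(cores_used * processing_minutes / chunk_minutes)


if __name__ == "__main__":
    print(f"{daily_volume_tb():.3f} TB per day")       # 0.498 TB per day
    print(f"{slowdown_factor():.1f}x slower than real time")  # 2.4x
    print(f"{cores_for_real_time()} cores for real time")     # 15 cores
```

Under the 16-bit assumption the raw volume comes to about 0.498 TB per day, which is consistent with the 0.5 TB per patient per day quoted above. The benchmark of 12 minutes for a 5-minute chunk is 2.4 times slower than real time, so at least 15 cores of the same kind would be needed to keep pace even with perfect scaling.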