This research uses the national Healthcare Cost & Utilization Project (HCUP-3) databases to construct Support Vector Machine (SVM) classifiers that predict clinical charge profiles, including hospital charges and length of stay (LOS), for patients diagnosed with heart and circulatory disease, diabetes, and cancer, respectively. Clinical charge profile predictions can provide relevant clinical knowledge that helps healthcare policy makers manage healthcare services and costs effectively at the national, state, and local levels. Despite its solid mathematical foundation and promising experimental results, SVM is poorly suited to large-scale data mining tasks because its training time complexity is at least quadratic in the number of samples. Furthermore, traditional SVM classification algorithms cannot build an effective SVM when different data distribution patterns are intermingled in a large dataset. To enhance SVM training for large, complex, and noisy healthcare datasets, we propose the Multi-level Support Vector Machine (MLSVM), which organizes the dataset as clusters in a tree to produce better partitions for more effective SVM classification. The MLSVM model uses multiple SVMs, each of which efficiently learns the local data distribution patterns in one cluster. A decision fusion algorithm then generates a global decision that incorporates the local SVM decisions at different levels of the tree. Consequently, MLSVM can handle complex and often conflicting data distributions in large datasets more effectively than single-SVM approaches and other multiple-SVM systems. Both the combined 5x2-fold cross-validation F test and the independent test show that the classification performance of MLSVM is much superior to that of CVM, ACSVM, and CSVM on three popular performance evaluation metrics. In this work, CSVM and MLSVM are also parallelized to speed up the slow SVM training process for very large and complex datasets. Running time analysis shows that MLSVM accelerates SVM training noticeably when the parallel algorithm is employed.
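For readers unfamiliar with the combined 5x2-fold cross-validation F test mentioned above, the following is a minimal sketch of how its statistic is usually computed. It is not the authors' code. The function name `combined_5x2cv_f` and the input layout (five pairs of per-fold differences in a performance metric between two classifiers) are assumptions made for illustration, and the sample numbers are invented.

```python
def combined_5x2cv_f(diffs):
    """Combined 5x2cv F statistic for comparing two classifiers.

    diffs: five (p1, p2) pairs, one per replication of 2-fold cross
    validation, where p1 and p2 are the differences in a metric (for
    example, error rate) between the two classifiers on fold 1 and fold 2.
    Under the null hypothesis of equal performance, the statistic is
    approximately F-distributed with (10, 5) degrees of freedom.
    """
    if len(diffs) != 5:
        raise ValueError("expected exactly 5 replications of 2-fold CV")

    sum_squared_diffs = 0.0  # numerator: sum of all ten squared differences
    sum_variances = 0.0      # denominator: sum of the five variance estimates
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2.0
        sum_squared_diffs += p1 * p1 + p2 * p2
        sum_variances += (p1 - mean) ** 2 + (p2 - mean) ** 2

    if sum_variances == 0.0:
        raise ValueError("all replications have zero variance; statistic undefined")
    return sum_squared_diffs / (2.0 * sum_variances)


# Hypothetical per-fold error-rate differences (classifier A minus classifier B).
example_diffs = [(0.04, 0.06), (0.05, 0.03), (0.07, 0.05), (0.02, 0.06), (0.05, 0.04)]
f_stat = combined_5x2cv_f(example_diffs)
```

The null hypothesis of equal performance is rejected at the 0.05 level when the statistic exceeds the upper 5% critical value of the F(10, 5) distribution, roughly 4.74.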