A Comparison of Four Data Mining Models: Bayes, Neural Network, SVM and Decision Trees in Identifying Syndromes in Coronary Heart Disease

  • Authors:
  • Jianxin Chen; Yanwei Xing; Guangcheng Xi; Jing Chen; Jianqiang Yi; Dongbin Zhao; Jie Wang

  • Affiliations:
  • Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China; Guanganmen Hospital, Chinese Academy of Chinese Medical Science, 100053, Beijing, China; Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China; Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China; Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China; Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China; Guanganmen Hospital, Chinese Academy of Chinese Medical Science, 100053, Beijing, China

  • Venue:
  • ISNN '07 Proceedings of the 4th international symposium on Neural Networks: Advances in Neural Networks
  • Year:
  • 2007

Abstract

Coronary heart disease (CHD) is a serious disease that causes increasing morbidity and mortality. Combining western medicine with Traditional Chinese Medicine (TCM) to treat CHD has become especially important for the medical community: western medicine faces problems such as high cost and more side effects, and TCM can serve as a complementary alternative that offsets these shortcomings. Because the syndrome is the core concept of TCM, identifying which syndrome a CHD patient presents has been a challenging problem. In this paper, we carry out a large-scale clinical epidemiological survey and collect data on 1069 cases, each of which is a CHD instance but may be diagnosed with different syndromes. Taking blood stasis syndrome (frequency 69%) as an example, we apply four distinct data mining algorithms (a Bayesian model, a neural network, a support vector machine, and decision trees) to classify the data and compare their performance. The results indicate that the neural network is the best identifier, with 88.6% accuracy on the holdout samples. The support vector machine is next with 82.5% accuracy, slightly higher than the Bayesian model at 82.0%. The decision tree performs worst, at only 80.4%. We conclude that, for identifying syndromes in CHD, the neural network offers the most useful insight for clinical application.
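
To put the reported holdout accuracies in context, the following minimal sketch shows how holdout accuracy could be computed and compared against a majority-class baseline. It is not the authors' code: the labels are synthetic placeholders that only mirror the 1069 cases and 69% frequency stated in the abstract, and the 30% holdout fraction, the seed, and the helper names `holdout_split`, `accuracy`, and `majority_label` are assumptions made for illustration.

```python
import random


def holdout_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and return (train, holdout) lists."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * test_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def accuracy(predicted, actual):
    """Fraction of labels predicted correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def majority_label(labels):
    """Most frequent label in the list."""
    return max(set(labels), key=labels.count)


# Synthetic labels only (assumption): 1 = blood stasis syndrome, 0 = other.
N_CASES = 1069
n_positive = round(N_CASES * 0.69)
labels = [1] * n_positive + [0] * (N_CASES - n_positive)

# The 30% holdout fraction is an assumption; the abstract does not state it.
train, holdout = holdout_split(labels, test_fraction=0.3, seed=0)

# Baseline: always predict the most frequent training label.
baseline = majority_label(train)
baseline_accuracy = accuracy([baseline] * len(holdout), holdout)

# Holdout accuracies as reported in the abstract.
reported = {
    "neural network": 0.886,
    "support vector machine": 0.825,
    "Bayesian model": 0.820,
    "decision tree": 0.804,
}
ranking = sorted(reported, key=reported.get, reverse=True)

print(f"majority-class baseline accuracy: {baseline_accuracy:.3f}")
for name in ranking:
    print(f"{name}: {reported[name]:.3f}")
```

With a class frequency of 69%, a classifier that always predicts blood stasis syndrome would already score near 0.69 on a random holdout set, so the reported figures are better read as gains over that baseline than as gains over chance.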