The impact of feature representation to the biclustering of symptoms-herbs in TCM

  • Authors:
  • Simon Poon; Zhe Luo; Runshun Zhang

  • Affiliations:
  • School of Information Technologies, University of Sydney, Sydney, Australia; School of Information Technologies, University of Sydney, Sydney, Australia; China Academy of Chinese Medical Sciences, Guang'anmen Hospital, Beijing, China

  • Venue:
  • PAKDD'11 Proceedings of the 15th international conference on New Frontiers in Applied Data Mining
  • Year:
  • 2011

Abstract

Traditional Chinese Medicine (TCM) is a holistic approach to medical treatment in which analysis and decisions cannot be made in isolation. Extracting symptom-herb relationships is therefore a crucial step in researching the underlying TCM principles. Because this kind of relationship closely resembles gene-expression data in microarray analysis, where biclustering algorithms are common, it is logical to apply biclustering algorithms to the study of symptom-herb relationships. However, the choice of feature representation is a dominant factor in the success of any machine learning problem. This paper aims to understand the impact of different representation schemes on the biclustering of symptom-herb relationships. A bicluster is not helpful if it contains too many or too few features. For obtaining biclusters of a desirable size, the modified relative success ratio is considered more appropriate than the other four schemes examined. Some of the biclusters obtained with the modified relative success ratio follow the therapeutic principles of TCM, while others contain interesting feature combinations that are worthwhile for clinical evaluation.
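
To make concrete the point that feature representation can change which biclusters are found, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the toy records, the two representations (raw co-occurrence counts and per-symptom frequencies), the thresholds, and the naive all-ones bicluster enumeration. It does not reproduce the paper's five schemes or the modified relative success ratio, whose definitions are not given in this abstract.

```python
from itertools import combinations

# Hypothetical toy records: (observed symptoms, prescribed herbs).
records = [
    ({"s1", "s2"}, {"h1", "h2"}),
    ({"s1", "s2"}, {"h1", "h2"}),
    ({"s1"}, {"h1", "h3"}),
    ({"s3"}, {"h3"}),
]

def cooccurrence(records):
    """Representation A (illustrative): raw symptom-herb co-occurrence counts."""
    counts = {}
    for symptoms, herbs in records:
        for s in symptoms:
            for h in herbs:
                counts[(s, h)] = counts.get((s, h), 0) + 1
    return counts

def row_frequency(counts):
    """Representation B (illustrative): each count divided by its symptom's total."""
    totals = {}
    for (s, _), c in counts.items():
        totals[s] = totals.get(s, 0) + c
    return {(s, h): c / totals[s] for (s, h), c in counts.items()}

def binarize(values, threshold):
    """Keep the (symptom, herb) pairs whose value reaches the threshold."""
    return {pair for pair, v in values.items() if v >= threshold}

def all_ones_biclusters(pairs, min_size, max_size):
    """Naive enumeration (assumed, not the paper's algorithm): for every herb
    subset, collect the symptoms linked to all herbs in it, and keep the result
    only if symptoms + herbs falls within [min_size, max_size]."""
    herbs = sorted({h for _, h in pairs})
    symptoms = sorted({s for s, _ in pairs})
    found = []
    for k in range(1, len(herbs) + 1):
        for subset in combinations(herbs, k):
            rows = tuple(s for s in symptoms if all((s, h) in pairs for h in subset))
            if rows and min_size <= len(rows) + len(subset) <= max_size:
                found.append((rows, subset))
    return found

counts = cooccurrence(records)
raw_pairs = binarize(counts, threshold=2)
freq_pairs = binarize(row_frequency(counts), threshold=0.5)

# Same data, same size window, different representation.
raw_biclusters = all_ones_biclusters(raw_pairs, min_size=4, max_size=4)
freq_biclusters = all_ones_biclusters(freq_pairs, min_size=4, max_size=4)
print(raw_biclusters)
print(freq_biclusters)
```

In this toy setting the two representations binarize to different symptom-herb matrices, so the same size window keeps a bicluster under one representation and none under the other. The subset enumeration is exponential in the number of herbs and is only suitable for tiny examples.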