Similarity computing model of high dimension data for symptom classification of Chinese traditional medicine

Authors:
Jing Peng;Chang-jie Tang;Dong-qing Yang;Jing Zhang;Jian-jun Hu
Affiliations:
School of Computer Science and Engineering, Sichuan University, Chengdu 610065, China and School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China;School of Computer Science and Engineering, Sichuan University, Chengdu 610065, China;School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China;Chengdu Jiuheyuan Industry Company, Chengdu 610015, China;School of Computer Science and Engineering, Sichuan University, Chengdu 610065, China
Venue:
Applied Soft Computing
Year:
2009

Abstract

In recent years, researchers have paid increasing attention to data mining in practical applications. To address the problem of symptom classification in Chinese traditional medicine, this paper proposes a novel computing model that uses the similarities among the attributes of high-dimensional data to compute the similarity between any two tuples. The model treats the data attributes as basis vectors in m dimensions and each tuple as the vector sum of its attribute-vectors. Based on prior concept-similarity information among the attributes, it introduces a novel distance algorithm for computing the similarity distance between any pair of attribute-vectors. In this method, the similarity between two tuples is expressed through formulas over the attribute-vectors and their projections onto each other, so the similarity between any pair of tuples can be obtained by evaluating these vectors and formulas. The paper also presents a novel classification algorithm based on the similarity computing model and applies it successfully to the symptom classification of Chinese traditional medicine. Extensive experiments demonstrate the efficiency of the algorithm.
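
The abstract does not give the paper's formulas, so the following is only a minimal sketch of one way such a model could be read, not the authors' algorithm. It assumes that the prior similarity between attributes i and j is supplied as an entry G[i][j] of a symmetric matrix, interpreted as the cosine between two unit-length attribute-vectors, and that a tuple is the weighted sum of its attribute-vectors. Under that assumption the inner product of two tuples x and y is the double sum of x[i] * G[i][j] * y[j]. All function names, the example attributes, and the similarity values are hypothetical.

```python
import math

def gram_inner(x, y, G):
    """Inner product of two tuples whose attributes are (possibly
    non-orthogonal) unit basis vectors with pairwise cosines G[i][j]."""
    return sum(x[i] * G[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def tuple_distance(x, y, G):
    """Length of the difference vector x - y under the attribute similarities."""
    d = [a - b for a, b in zip(x, y)]
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(gram_inner(d, d, G), 0.0))

def tuple_similarity(x, y, G):
    """Cosine of the angle between the two sum vectors (0.0 for a zero tuple)."""
    nx = math.sqrt(gram_inner(x, x, G))
    ny = math.sqrt(gram_inner(y, y, G))
    if nx == 0.0 or ny == 0.0:
        return 0.0
    return gram_inner(x, y, G) / (nx * ny)

def classify(query, labelled, G):
    """Assumed 1-nearest-neighbour rule: return the label of the closest tuple."""
    return min(labelled, key=lambda item: tuple_distance(query, item[0], G))[1]

# Hypothetical attributes: 0 = "fever", 1 = "high temperature", 2 = "cough".
# The values below are made up for illustration only.
G = [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.1],
     [0.1, 0.1, 1.0]]

fever = [1, 0, 0]
high_temperature = [0, 1, 0]
cough = [0, 0, 1]

# With an identity matrix the first two tuples would be orthogonal; with G
# they are close, because their attributes are declared similar.
print(tuple_distance(fever, high_temperature, G))
print(tuple_distance(fever, cough, G))
print(classify(high_temperature, [(fever, "A"), (cough, "C")], G))
```

In this reading, two tuples that share no attribute can still be close when their attributes are conceptually similar, which is the property the abstract emphasises. The nearest-neighbour rule is a stand-in; the abstract does not describe the paper's classification algorithm in detail.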