In this paper, a decision cluster forest classification model is proposed for high-dimensional data with multiple classes. A decision cluster forest (DCF) consists of a set of decision cluster trees, in which the leaves of each tree are clusters labeled with the same class; that label determines the class of new objects falling into those clusters. A decision cluster tree is generated by recursively calling a variable-weighting k-means algorithm on the subset of the training data containing the objects of one class. The set of m decision cluster trees grown from the subsets of the m classes constitutes the decision cluster forest. The Anderson-Darling test is used as the stopping condition for tree growing. A DCF classification (DCFC) model is selected from all leaves of the m decision cluster trees in the forest. A series of experiments on both synthetic and real data sets has shown that the DCFC model outperforms the single decision cluster tree method, as well as k-NN, decision tree, and SVM, in accuracy and scalability. The new model is particularly suitable for large, high-dimensional data with many classes.
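To make the overall structure concrete, the sketch below illustrates the decision-cluster-forest idea in Python. It is not the authors' implementation: plain (unweighted) k-means stands in for the variable-weighting k-means, a simple depth/size cap replaces the Anderson-Darling stopping test, and all class and function names are illustrative. It only shows how per-class cluster trees can be grown and how their leaf clusters classify new objects by nearest centroid.

```python
# Minimal sketch of a decision-cluster-forest style classifier (assumptions noted above).
import numpy as np
from sklearn.cluster import KMeans

class ClusterNode:
    def __init__(self, center, label):
        self.center = center      # cluster centroid
        self.label = label        # class label of the training subset
        self.children = []        # sub-clusters (empty for a leaf)

def grow_tree(X, label, k=2, min_size=20, depth=0, max_depth=5):
    """Recursively cluster one class's objects into a decision cluster tree."""
    node = ClusterNode(X.mean(axis=0), label)
    # Stopping rule: a size/depth cap stands in for the Anderson-Darling test.
    if len(X) < k * min_size or depth >= max_depth:
        return node
    km = KMeans(n_clusters=k, n_init=10).fit(X)
    for c in range(k):
        Xc = X[km.labels_ == c]
        if len(Xc) > 0:
            node.children.append(grow_tree(Xc, label, k, min_size, depth + 1, max_depth))
    return node

def leaves(node):
    """Collect the leaf clusters of a tree."""
    return [node] if not node.children else [l for c in node.children for l in leaves(c)]

def fit_forest(X, y, **kw):
    """Grow one decision cluster tree per class; the leaf clusters form the DCFC model."""
    return [leaf for cls in np.unique(y)
                 for leaf in leaves(grow_tree(X[y == cls], cls, **kw))]

def predict(model, X):
    """Assign each object the class of its nearest leaf-cluster centroid."""
    centers = np.array([leaf.center for leaf in model])
    labels = np.array([leaf.label for leaf in model])
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return labels[dists.argmin(axis=1)]
```

In this simplified form, classification cost depends on the number of leaf clusters rather than the number of training objects, which is the property the abstract credits for the model's scalability on large, high-dimensional data.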