Combining the Strength of Pattern Frequency and Distance for Classification

Authors:
Jinyan Li; Kotagiri Ramamohanarao; Guozhu Dong
Venue:
PAKDD '01 Proceedings of the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining
Year:
2001

Citing 10
Cited 6

Instance-Based Learning Algorithms
Machine Learning

C4.5: programs for machine learning
C4.5: programs for machine learning

Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data

From data mining to knowledge discovery: an overview
Advances in knowledge discovery and data mining

Efficient mining of emerging patterns: discovering trends and differences
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining

Extending naïve Bayes classifiers using long itemsets
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining

Making use of the most expressive jumping emerging patterns for classification
Knowledge and Information Systems

Instance-Based Classification by Emerging Patterns
PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery

Improved use of continuous attributes in C4.5
Journal of Artificial Intelligence Research

Average-case analysis of a nearest neighbor algorithm
IJCAI'93 Proceedings of the 13th international joint conference on Artificial intelligence - Volume 2

RIONA: A Classifier Combining Rule Induction and k-NN Method with Automated Selection of Optimal Neighbourhood
ECML '02 Proceedings of the 13th European Conference on Machine Learning

A Multi-Strategy Approach to KNN and LARM on Small and Incrementally Induced Prediction Knowledge
ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications

Contrast pattern mining and its applications
ADC '10 Proceedings of the Twenty-First Australasian Conference on Database Technologies - Volume 104

Combination of metric-based and rule-based classification
RSFDGrC'05 Proceedings of the 10th international conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing - Volume Part I

Analogy-based reasoning in classifier construction
Transactions on Rough Sets IV

RIONA: A New Classification System Combining Rule Induction and Instance-Based Learning
Fundamenta Informaticae

Abstract

Supervised classification draws on many heuristics as the basis for induction algorithms, including decision trees, k-nearest neighbour (k-NN), pattern frequency, neural networks, and Bayes' rule. In this paper, we propose a new instance-based induction algorithm that combines the strength of pattern frequency and distance. We define a neighbourhood of a test instance. If the neighbourhood contains training data, we use k-NN to make the decision. Otherwise, we examine the support (frequency) of certain types of subsets of the test instance and sum these supports for prediction. This scheme is intended to deal with outliers: when no training data lie near a test instance, distance is not a reliable predictor for classification. We present an effective method for choosing an "optimal" neighbourhood factor for a given data set, guided by part of the training data. We find that our algorithm maintains (and sometimes exceeds) the outstanding accuracy of k-NN on data sets containing purely continuous attributes, and that it greatly improves on the accuracy of k-NN on data sets containing a mixture of continuous and categorical attributes. In general, our method is far superior to C5.0.
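
The two-stage decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the neighbourhood definition, the types of subsets examined, or the selection procedure, so everything below is an assumption made for illustration. In particular, the names `attr_gap`, `distance`, `classify`, and `choose_alpha` are hypothetical; the neighbourhood is assumed to be a Euclidean ball of radius `alpha` over attributes scaled to [0, 1]; the "subsets" are assumed to be all attribute subsets of up to `max_size` attributes; and the neighbourhood factor is assumed to be picked by accuracy on a held-out part of the training data.

```python
import math
from collections import Counter
from itertools import combinations


def attr_gap(a, b):
    """Assumed per-attribute gap: |a - b| for numbers, 0/1 mismatch otherwise."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return abs(a - b)
    return 0.0 if a == b else 1.0


def distance(x, y):
    """Euclidean distance over the per-attribute gaps (an assumption)."""
    return math.sqrt(sum(attr_gap(a, b) ** 2 for a, b in zip(x, y)))


def classify(test, train, alpha, k=3, max_size=2):
    """Return (predicted label, which stage decided)."""
    # Stage 1: if training data fall inside the neighbourhood, vote with k-NN.
    near = sorted(
        ((distance(test, x), label) for x, label in train
         if distance(test, x) <= alpha),
        key=lambda pair: pair[0],
    )
    if near:
        votes = Counter(label for _, label in near[:k])
        return votes.most_common(1)[0][0], "distance"

    # Stage 2: the neighbourhood is empty, so sum per-class supports of
    # subsets of the test instance (here: all subsets of up to max_size
    # attributes; an attribute "matches" when its gap is at most alpha).
    by_class = {}
    for x, label in train:
        by_class.setdefault(label, []).append(x)
    scores = {}
    for label, rows in by_class.items():
        total = 0.0
        for size in range(1, max_size + 1):
            for idx in combinations(range(len(test)), size):
                hits = sum(
                    all(attr_gap(test[i], x[i]) <= alpha for i in idx)
                    for x in rows
                )
                total += hits / len(rows)  # support within this class
        scores[label] = total
    return max(sorted(scores), key=scores.get), "frequency"


def choose_alpha(train, candidates, holdout_every=2):
    """Pick the candidate with the best accuracy on held-out training data."""
    held = train[::holdout_every]
    rest = [row for i, row in enumerate(train) if i % holdout_every != 0]

    def accuracy(alpha):
        return sum(classify(x, rest, alpha)[0] == y for x, y in held) / len(held)

    return max(candidates, key=accuracy)


# Toy data: one continuous attribute (scaled to [0, 1]) and one categorical.
train = [
    ((0.10, "red"), "A"),
    ((0.15, "red"), "A"),
    ((0.90, "blue"), "B"),
    ((0.85, "blue"), "B"),
]
near_result = classify((0.12, "red"), train, alpha=0.1)
outlier_result = classify((0.50, "blue"), train, alpha=0.1)
best_alpha = choose_alpha(train, candidates=[0.05, 0.1, 0.2])
```

In this toy run, the first test instance has training data inside its neighbourhood and is decided by distance, while the second has none and falls back to support summation. Reusing `alpha` as the matching tolerance in the fallback stage is a convenience of this sketch, not something the abstract states.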