Conceptual equivalence for contrast mining in classification learning

Authors:
Ying Yang; Xindong Wu; Xingquan Zhu
Affiliations:
Risk and Information Management Services, Micro Enterprises and Individuals, Australian Taxation Office, Australia; Department of Computer Science, University of Vermont, USA; Department of Computer Science and Engineering, Florida Atlantic University, USA
Venue:
Data & Knowledge Engineering
Year:
2008




Abstract

Learning often occurs through comparison. To compare data groups in classification learning, most existing methods compare either raw instances or learned classification rules against each other. This paper takes a different approach, conceptual equivalence: two groups are equivalent if their underlying concepts are equivalent, even if their instance spaces do not overlap and their rule sets do not look alike. The proposed comparison methodology learns a representation of each group's underlying concept and then cross-examines each group's instances against the other group's concept representation. The innovation is fivefold. First, it quantifies the degree of conceptual equivalence between two groups. Second, it traces the source of any discrepancy at two levels: an abstract level of underlying concepts and a specific level of instances. Third, it applies to numeric as well as categorical data. Fourth, it circumvents direct comparisons between (possibly large numbers of) rules, which demand substantial effort. Fifth, it reduces dependency on the accuracy of the employed classification algorithms. Empirical evidence suggests that the methodology is effective yet simple to use in scenarios such as noise cleansing and concept-change learning.
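
The cross-examination idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the learner or the scoring formula, so everything concrete here is an assumption made for illustration: a nearest-centroid learner stands in for the concept representation, and the score is the average of the two cross-agreement rates. All function names (`learn_centroids`, `cross_examine`, `equivalence_score`) and the toy data are hypothetical, and this simplified learner handles numeric features only.

```python
from collections import defaultdict


def learn_centroids(instances):
    """Learn a stand-in concept representation: one mean vector per class.

    ASSUMPTION: nearest-centroid is used only for illustration; the paper
    does not prescribe this learner.
    """
    sums, counts = {}, defaultdict(int)
    for features, label in instances:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, x in enumerate(features):
            sums[label][i] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_examine(concept, instances):
    """Classify one group's instances with the other group's concept.

    Returns the agreement rate and the disagreeing instances, which point
    to discrepancies at the instance level.
    """
    disagreements = [(f, y) for f, y in instances if predict(concept, f) != y]
    agreement = 1.0 - len(disagreements) / len(instances)
    return agreement, disagreements


def equivalence_score(group_a, group_b):
    """ASSUMPTION: average of the two cross-agreement rates, in [0, 1]."""
    concept_a = learn_centroids(group_a)
    concept_b = learn_centroids(group_b)
    a_on_b, _ = cross_examine(concept_a, group_b)
    b_on_a, _ = cross_examine(concept_b, group_a)
    return (a_on_b + b_on_a) / 2.0


# Toy data: group_b shares group_a's concept but none of its instances;
# group_c has the same instances as group_b with the labels swapped.
group_a = [((0.0, 0.0), "neg"), ((1.0, 1.0), "neg"),
           ((4.0, 4.0), "pos"), ((5.0, 5.0), "pos")]
group_b = [((0.5, 0.2), "neg"), ((0.8, 1.1), "neg"),
           ((4.5, 4.2), "pos"), ((5.2, 4.9), "pos")]
group_c = [(f, "pos" if y == "neg" else "neg") for f, y in group_b]

score_ab = equivalence_score(group_a, group_b)
score_ac = equivalence_score(group_a, group_c)
print(score_ab, score_ac)
```

In this toy setup, `group_a` and `group_b` have no instances in common yet agree completely under cross-examination, while `group_c` disagrees on every instance. The list returned by `cross_examine` shows which instances drive a low score.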