In business and industry today, large databases with mixed data types (continuous and categorical) are very common, and there is a great need to discover patterns in them for knowledge interpretation and understanding. For classification, this problem has been solved as a discrete data problem by first discretizing the continuous data based on the class-attribute interdependence relationship. However, no proper solution has existed so far when class information is unavailable. Hence, important pattern post-processing tasks such as pattern clustering and summarization cannot be applied to mixed-mode data. This paper presents a new method for solving the problem. It is based on two essential concepts. (1) Even when class information is absent, in a correlated dataset the attribute with the strongest interdependence with the others in the group can be used to drive the discretization of the continuous data. (2) For a large database, correlated attribute groups must first be obtained by attribute clustering before (1) can be applied. Based on (1) and (2), pattern discovery methods are developed for mixed-mode data. Extensive experiments using synthetic and real-world data were conducted to validate the usefulness and effectiveness of the proposed method.
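Concept (1) can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes one correlated attribute group has already been found (concept (2) is skipped), it uses plain mutual information as a stand-in interdependence measure, and it places only a single cut point per continuous attribute. The function names (`pick_driver`, `best_cut`), the provisional equal-frequency binning, and the toy data are all hypothetical, introduced purely for illustration.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

def equal_frequency_bins(values, k=2):
    """Provisional discretization: k rank-based bins of roughly equal size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins

def pick_driver(columns, continuous):
    """Name of the attribute with the largest summed MI with the other attributes.

    Continuous attributes are binned provisionally so that MI can be computed.
    """
    provisional = {
        name: equal_frequency_bins(col) if name in continuous else col
        for name, col in columns.items()
    }
    def total(name):
        return sum(
            mutual_information(provisional[name], provisional[other])
            for other in provisional if other != name
        )
    return max(provisional, key=total)

def best_cut(values, driver_labels):
    """Single cut point on a continuous attribute that maximizes MI with the driver."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(
        candidates,
        key=lambda c: mutual_information([v > c for v in values], driver_labels),
    )

# Toy correlated group (made-up values): one categorical, two continuous attributes.
columns = {
    "grade": ["lo", "lo", "lo", "hi", "hi", "hi", "hi", "hi"],
    "temp":  [1.0, 1.2, 0.9, 4.9, 5.0, 5.2, 5.1, 5.3],
    "flow":  [10.0, 11.0, 9.0, 21.0, 20.0, 19.0, 22.0, 23.0],
}
continuous = {"temp", "flow"}

driver = pick_driver(columns, continuous)
cuts = {name: best_cut(columns[name], columns[driver]) for name in continuous}
```

In this toy group the provisional median split of `temp` and `flow` each misplaces one record, whereas the cut chosen against the driver attribute falls in the natural gap between the low and high values. That is the intuition behind letting the most interdependent attribute play the role that the class label plays in class-dependent discretization.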