CAIM Discretization Algorithm

Authors:
Lukasz A. Kurgan; Krzysztof J. Cios
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2004

Abstract

Machine learning algorithms are often used to extract knowledge from databases. Most of these algorithms can be applied only to data described by discrete numerical or nominal attributes (features), so continuous attributes must first be transformed into discrete ones by a discretization algorithm. This paper describes such an algorithm, called CAIM (class-attribute interdependence maximization), which is designed to work with supervised data. The goal of the CAIM algorithm is to maximize the class-attribute interdependence while generating a (possibly) minimal number of discrete intervals. Unlike some other discretization algorithms, it does not require the user to predefine the number of intervals. Tests performed using CAIM and six other state-of-the-art discretization algorithms show that discrete attributes generated by CAIM almost always have the lowest number of intervals and the highest class-attribute interdependency. Two machine learning algorithms, the CLIP4 rule algorithm and the decision tree algorithm, are used to generate classification rules from data discretized by CAIM. For both, the accuracy of the generated rules is higher and the number of rules is lower for data discretized with CAIM than for data discretized with the six other discretization algorithms. Across the seven discretization algorithms compared, the highest classification accuracy was achieved on data sets discretized with CAIM.
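
The abstract does not state the criterion or the search procedure, so the following is only a minimal sketch under stated assumptions. It assumes the commonly cited form of the CAIM criterion, (1/n) * sum over intervals r of (max_r^2 / M_r), where n is the number of intervals, M_r is the number of samples in interval r, and max_r is the largest single-class count in that interval. It also assumes a greedy top-down search that adds one boundary at a time and stops once the criterion no longer improves and there are at least as many intervals as classes. The function names `caim_score` and `caim_discretize` are hypothetical and not taken from the paper.

```python
from bisect import bisect_left
from collections import Counter


def caim_score(values, labels, cuts):
    """Assumed CAIM criterion for the intervals defined by sorted cut points."""
    n_intervals = len(cuts) + 1
    counts = [Counter() for _ in range(n_intervals)]
    for v, y in zip(values, labels):
        # bisect_left maps a value to the index of the interval it falls in.
        counts[bisect_left(cuts, v)][y] += 1
    total = 0.0
    for c in counts:
        size = sum(c.values())
        if size:  # skip empty intervals to avoid dividing by zero
            total += max(c.values()) ** 2 / size
    return total / n_intervals


def caim_discretize(values, labels):
    """Greedy sketch: repeatedly add the boundary that maximizes the criterion."""
    distinct = sorted(set(values))
    # Candidate boundaries: midpoints between adjacent distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    n_classes = len(set(labels))
    cuts, best_global = [], 0.0
    while candidates:
        score, cut = max(
            (caim_score(values, labels, sorted(cuts + [c])), c)
            for c in candidates
        )
        # Assumed stopping rule: accept while the criterion improves, or while
        # there are still fewer intervals than classes.
        if score > best_global or len(cuts) + 1 < n_classes:
            cuts = sorted(cuts + [cut])
            candidates.remove(cut)
            best_global = score
        else:
            break
    return cuts


values = [1, 2, 3, 10, 11, 12]
labels = [0, 0, 0, 1, 1, 1]
print(caim_discretize(values, labels))  # [6.5]
```

On this toy attribute the single boundary at 6.5 separates the two classes perfectly. Any further split lowers the assumed criterion because the sum is divided by a larger number of intervals, which is how the number of intervals is kept small without the user specifying it.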