This paper proposes a new discretization algorithm for uncertain data. Uncertainty is widespread in real-world data. Many factors give rise to it, including data acquisition device error, approximate measurement, sampling faults, transmission latency, and data integration error. In many cases the uncertainty of the underlying data can be estimated and modeled, and many classical data mining algorithms have been redesigned or extended to process uncertain data. Discretization methods need to account for data uncertainty as well. The proposed algorithm is called UCAIM (Uncertain Class-Attribute Interdependency Maximization). Uncertainty can be modeled as either a formula-based or a sample-based probability distribution function (pdf). We use probability cardinality to build the quanta matrix of the uncertain attributes, which is then used to evaluate class-attribute interdependency with the redesigned ucaim criterion. The algorithm selects the discretization scheme with the highest ucaim value. Experiments show that using the uncertainty information helps UCAIM perform well on uncertain data: it significantly outperforms the traditional CAIM algorithm, especially when the uncertainty is high.
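The quanta-matrix step described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: each uncertain value is given as a sample-based pdf with equally weighted samples, an instance contributes to an interval the fraction of its samples that falls inside it, and the score is the classic CAIM criterion (the mean over intervals of max_r^2 / M_{+r}) applied to the fractional counts. The redesigned ucaim criterion is not specified in this abstract and may differ. All function names and the toy data are hypothetical.

```python
from bisect import bisect_right


def interval_probabilities(samples, cut_points):
    """Fraction of one instance's samples in each interval.

    Assumption: samples are equally weighted. With sorted cut points
    [c1, ..., ck] the intervals are (-inf, c1), [c1, c2), ..., [ck, +inf).
    """
    probs = [0.0] * (len(cut_points) + 1)
    for s in samples:
        probs[bisect_right(cut_points, s)] += 1.0 / len(samples)
    return probs


def quanta_matrix(instances, labels, classes, cut_points):
    """Class-by-interval matrix of summed probabilities (fractional counts)."""
    matrix = {c: [0.0] * (len(cut_points) + 1) for c in classes}
    for samples, label in zip(instances, labels):
        for r, p in enumerate(interval_probabilities(samples, cut_points)):
            matrix[label][r] += p
    return matrix


def caim_value(matrix):
    """Classic CAIM criterion on a quanta matrix (stand-in for ucaim)."""
    n_intervals = len(next(iter(matrix.values())))
    total = 0.0
    for r in range(n_intervals):
        column = [row[r] for row in matrix.values()]
        column_sum = sum(column)
        if column_sum > 0:  # skip intervals that carry no probability mass
            total += max(column) ** 2 / column_sum
    return total / n_intervals


# Toy data: four uncertain values, each represented by four samples.
instances = [
    [1.0, 1.2, 0.8, 1.1],
    [2.9, 3.1, 3.0, 2.4],
    [0.9, 1.0, 2.6, 1.1],
    [3.2, 2.8, 3.0, 3.1],
]
labels = ["a", "b", "a", "b"]
classes = ["a", "b"]

# Score two candidate single-cut schemes and keep the higher-scoring one.
candidates = [[1.05], [2.5]]
scores = {
    tuple(cuts): caim_value(quanta_matrix(instances, labels, classes, cuts))
    for cuts in candidates
}
best_scheme = max(scores, key=scores.get)
```

On this toy data the cut at 2.5 yields the quanta matrix a: [1.75, 0.25], b: [0.25, 1.75] and scores higher than the cut at 1.05, so it is the scheme kept. An instance whose samples straddle a cut point spreads its unit of mass across both intervals instead of being forced into one, which is where the uncertainty information enters the criterion.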