Real-life datasets commonly contain missing values scattered throughout the data. Treated as "dirty data", such data are usually removed during the pre-processing step of the KDD process. Starting from the premise that "making up this missing data is better than throwing it away", we present a new approach for completing missing data. Its main originality is that it exploits a synergy between generic bases of association rules and the handling of missing values. Beyond their compactness, generic association rules considerably reduce the number of conflicts during the completion step. We also introduce a new metric, called "Robustness", which selects the most robust association rule for completing a missing value whenever a conflict appears. Experiments carried out on benchmark datasets confirm the soundness of the approach: it reduces conflicts during the completion step while achieving a high percentage of correct completions.
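The completion step described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not define the "Robustness" metric or how the generic basis is mined, so the rules here are hand-written and the `score` field is a hypothetical placeholder used only to rank conflicting rules. The names `Rule`, `complete_row`, and the toy attributes are likewise assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

Row = Dict[str, Optional[str]]  # None marks a missing value


class Rule(NamedTuple):
    premise: Dict[str, str]      # attribute -> value required for the rule to apply
    conclusion: Tuple[str, str]  # (attribute, value) predicted by the rule
    score: float                 # placeholder ranking score, NOT the paper's Robustness


def complete_row(row: Row, rules: List[Rule]) -> Tuple[Row, int]:
    """Fill the missing values of one row; return the new row and the conflict count."""
    completed = dict(row)
    conflicts = 0
    for attr, value in row.items():
        if value is not None:
            continue
        # Rules that predict this attribute and whose premise matches the known values.
        candidates = [
            r for r in rules
            if r.conclusion[0] == attr
            and all(row.get(a) == v for a, v in r.premise.items())
        ]
        if not candidates:
            continue  # no applicable rule: the value stays missing
        if len({r.conclusion[1] for r in candidates}) > 1:
            conflicts += 1  # applicable rules disagree on the value
        # Resolve the conflict by keeping the highest-scoring rule.
        best = max(candidates, key=lambda r: r.score)
        completed[attr] = best.conclusion[1]
    return completed, conflicts


# Toy example: two applicable rules disagree on the missing "play" value.
rules = [
    Rule({"outlook": "sunny"}, ("play", "no"), 0.6),
    Rule({"outlook": "sunny", "humidity": "normal"}, ("play", "yes"), 0.9),
]
row = {"outlook": "sunny", "humidity": "normal", "play": None}
completed, conflicts = complete_row(row, rules)
print(completed, conflicts)
```

In this sketch a conflict is counted whenever the applicable rules propose different values for the same missing attribute. A smaller rule set leaves fewer candidates per missing value, which is the intuition behind the reduction in conflicts claimed for generic bases.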