ABBA: adaptive bicluster-based approach to impute missing values in binary matrices

Authors:
Alessandro Colantonio;Roberto Di Pietro;Alberto Ocello;Nino Vincenzo Verde
Affiliations:
Engiweb Security, Roma, Italy;Università di Roma Tre, Roma, Italy;Engiweb Security, Roma, Italy;Università di Roma Tre, Roma, Italy
Venue:
Proceedings of the 2010 ACM Symposium on Applied Computing
Year:
2010

Abstract

Missing values frequently pose problems in the analysis of binary matrices, since they can hinder downstream analysis of the datasets. Although many imputation methods have been developed to substitute missing values with estimated values, the available techniques share some common disadvantages: they require fixing some parameters (e.g., number of patterns, number of rows to consider) to estimate missing values, with little theoretical support for determining these parameters; and missing values need to be recomputed from scratch whenever the parameters change. In this paper we propose a novel algorithm (ABBA: Adaptive Bicluster-Based Approach) that does not have the above limitations. Further, we detail a formal framework that justifies the rationale behind ABBA. Finally, experimental results on both synthetic and real data confirm the viability of our approach and the quality of its results, which surpass those achieved by the main competing algorithm (KNN).
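
To make the parameter dependence mentioned in the abstract concrete, the following is a minimal sketch of a generic k-nearest-neighbour imputation for a binary matrix. It is not the ABBA algorithm and not necessarily the exact KNN variant evaluated in the paper. The function names (`row_distance`, `knn_impute`), the normalized Hamming distance, the majority vote with ties resolved to 0, and the toy matrix are all assumptions made for illustration.

```python
from typing import List, Optional

Cell = Optional[int]  # 0, 1, or None for a missing value


def row_distance(a: List[Cell], b: List[Cell]) -> float:
    """Normalized Hamming distance over columns observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 1.0  # no shared evidence: treat the rows as maximally distant
    return sum(x != y for x, y in shared) / len(shared)


def knn_impute(matrix: List[List[Cell]], k: int) -> List[List[Cell]]:
    """Fill each missing cell by majority vote among the k nearest rows
    that have an observed value in that column (assumed scheme)."""
    result = [row[:] for row in matrix]  # copy, so the input is not modified
    for i, row in enumerate(matrix):
        for j, cell in enumerate(row):
            if cell is not None:
                continue
            # Candidate neighbours: other rows with an observed value in column j.
            candidates = sorted(
                (row_distance(row, other), idx)
                for idx, other in enumerate(matrix)
                if idx != i and other[j] is not None
            )
            neighbours = candidates[:k]
            if not neighbours:
                continue  # nothing to vote with: leave the cell missing
            ones = sum(matrix[idx][j] for _, idx in neighbours)
            # Strict majority of ones gives 1; ties go to 0 (an arbitrary choice).
            result[i][j] = 1 if 2 * ones > len(neighbours) else 0
    return result


# Hypothetical toy data: row 0 has one missing cell in the last column.
example = [
    [1, 1, 0, None],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]

small_k = knn_impute(example, k=2)  # votes come from the two identical rows
large_k = knn_impute(example, k=5)  # votes come from all five other rows
```

In this toy case the imputed value of the missing cell is 1 with `k=2` and 0 with `k=5`, and each change of `k` requires rerunning the whole imputation. This is the kind of parameter sensitivity the abstract describes for existing techniques.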