Application of clustering to estimate missing data and improve data integrity

Authors:
R. C. T. Lee; J. R. Slagle; C. T. Mong
Venue:
ICSE '76: Proceedings of the 2nd International Conference on Software Engineering
Year:
1976

Abstract

Two problems in the use of computerized data base systems are: (1) How can we estimate the values for missing data? (2) How can we improve data integrity, that is, reduce the number of errors in the data? The tool that we introduce to attack these problems is clustering analysis. Experimental results indicate that our method is feasible. Our algorithm detected an error in the book “Weyer's Warships of the World 1969.” Each of the approximately 2000 warships listed in the book has 18 variables associated with it. It would be difficult for a person to find errors in the book. Our methods do not require any a priori knowledge about the data, for example, about warships.
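The abstract does not say which clustering procedure the authors used, so the following is only a minimal sketch of the general idea, not their algorithm. It assumes that similar records can stand in for a cluster: a missing value is estimated from the most similar records, and a recorded value that disagrees strongly with that estimate is flagged as a possible error. The function names, the parameters `k` and `tolerance`, and the three-variable toy records are all hypothetical; the numbers are made up and are not taken from the book.

```python
from math import sqrt
from statistics import median


def distance(a, b, skip=None):
    """Root-mean-square difference over variables present in both records.

    The column `skip` is left out so that the value being estimated or
    checked does not influence which records count as similar.
    """
    pairs = [(x, y) for c, (x, y) in enumerate(zip(a, b))
             if c != skip and x is not None and y is not None]
    if not pairs:
        return float("inf")
    return sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


def estimate(records, i, col, k=3):
    """Median of `col` over the k records most similar to records[i]."""
    candidates = [j for j, r in enumerate(records)
                  if j != i and r[col] is not None]
    candidates.sort(key=lambda j: distance(records[i], records[j], skip=col))
    return median(records[j][col] for j in candidates[:k])


def fill_missing(records, k=3):
    """Return a copy of the records with every None replaced by an estimate."""
    return [[estimate(records, i, c, k) if v is None else v
             for c, v in enumerate(r)]
            for i, r in enumerate(records)]


def suspect_values(records, k=3, tolerance=0.5):
    """(row, column) pairs whose value is far from its neighbor-based estimate.

    "Far" is an assumed rule: the difference exceeds `tolerance` times
    the size of the estimate.
    """
    flagged = []
    for i, r in enumerate(records):
        for c, v in enumerate(r):
            if v is None:
                continue
            est = estimate(records, i, c, k)
            if abs(v - est) > tolerance * abs(est):
                flagged.append((i, c))
    return flagged


# Made-up records with three numeric variables each; None marks a missing value.
ships = [
    [100.0, 10.0, 30.0],
    [102.0, 11.0, 31.0],
    [98.0, 10.0, None],    # third variable is missing
    [101.0, 10.0, 29.0],
    [200.0, 25.0, 20.0],
    [205.0, 26.0, 21.0],
    [198.0, 250.0, 20.0],  # second variable looks like a typing error
    [202.0, 24.0, 19.0],
]

filled = fill_missing(ships)
flagged = suspect_values(ships)
print(filled[2])
print(flagged)
```

Like the method described in the abstract, this sketch uses no knowledge about what the variables mean. Two simplifications are worth noting: the variables are compared on their raw scales, and the median is used so that a single bad value among the neighbors does not distort the estimate. With 18 variables of different units, some form of rescaling would probably be needed first.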