An efficient approach to identify n-wMVD for eliminating data redundancy

Authors:
Sangeeta Viswanadham;Vatsavayi Valli Kumari
Affiliations:
Andhra University, Visakhapatnam, India;Andhra University, Visakhapatnam, India
Venue:
Proceedings of the CUBE International Information Technology Conference
Year:
2012

Abstract

Data cleaning is the process of determining whether two or more records that are represented differently in a database refer to the same real-world object. It is a vital function in data warehouse preprocessing. Duplication and redundancy arise frequently when large amounts of data collected from different sources are loaded into the warehouse. Eliminating this redundancy helps avoid the conflicts that lead to wrong decisions, and it also reduces wasted storage space. One way of eliminating redundancy is to retrieve similar records using tokens formed on prominent attributes. Another approach is to use Conditional Functional Dependencies (CFDs) to capture the consistency of data by combining semantically related data. Existing work on data cleaning does not deal with the case of multi-valued attributes. This paper deals with nesting-based weak multi-valued dependencies (n-wMVD), which can handle multi-valued attributes and redundancy removal. Our contributions are threefold: (i) an approach to convert the given database to wMVD; (ii) implementation of one-nesting on wMVD to produce n-wMVD; (iii) improvement of n-wMVD by implementation of two-nesting to eliminate redundancy. The applicability of our approach was tested. The results are encouraging and are presented in the paper.
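
The abstract does not spell out how nesting is carried out, so the following is only a minimal sketch of the general idea, assuming the standard nest operator from nested relational algebra rather than the authors' own algorithm. The function name `nest`, the sample relation, and its attribute names are hypothetical and chosen purely for illustration. Nesting once on one attribute collapses rows that differ only in that attribute; nesting the result on a second attribute collapses them further.

```python
from collections import defaultdict


def nest(rows, attr):
    """Group rows that agree on every attribute except `attr`,
    collecting the values of `attr` into a single set-valued field.
    (Hypothetical helper; not taken from the paper.)"""
    groups = defaultdict(set)
    for row in rows:
        # The key is the row without `attr`, made hashable and order-independent.
        key = tuple(sorted((k, v) for k, v in row.items() if k != attr))
        groups[key].add(row[attr])
    nested = []
    for key, values in groups.items():
        new_row = dict(key)
        new_row[attr] = frozenset(values)  # frozenset so the row can be nested again
        nested.append(new_row)
    return nested


# Illustrative relation: every teacher of a course is paired with every book.
rows = [
    {"course": "DB", "teacher": "Ann", "book": "B1"},
    {"course": "DB", "teacher": "Ann", "book": "B2"},
    {"course": "DB", "teacher": "Bob", "book": "B1"},
    {"course": "DB", "teacher": "Bob", "book": "B2"},
]

one = nest(rows, "book")     # one-nesting: one row per (course, teacher)
two = nest(one, "teacher")   # two-nesting: one row per course
```

In this toy relation the four flat rows shrink to two after nesting on `book` and to one after also nesting on `teacher`, which illustrates how set-valued attributes can absorb repeated values. How the paper identifies the dependencies that justify each nesting step is not described in the abstract and is not modelled here.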