Data genome: an abstract model for data evolution

  • Authors:
  • Deyou Tang;Jianqing Xi;Yubin Guo;Shunqi Shen

  • Affiliations:
  • School of Computer Science & Engineering, South China University of Technology, Guangzhou, China and Department of Computer Science & Technology, Hunan University of Technology, Zhuzhou, China;School of Computer Science & Engineering, South China University of Technology, Guangzhou, China;School of Computer Science & Engineering, South China University of Technology, Guangzhou, China;School of Computer Science & Engineering, South China University of Technology, Guangzhou, China

  • Venue:
  • ISICA'07 Proceedings of the 2nd international conference on Advances in computation and intelligence
  • Year:
  • 2007

Abstract

Modern information systems often process data that has been transferred, transformed or integrated from a variety of sources. In many application domains, information about the derivation of data items is crucial. A kind of metadata called data provenance is currently being investigated by many researchers, but provenance information must be collected and maintained explicitly by the dataset maintainer or by a specialized provenance management system. In this paper we investigate how derivation information can be provided to applications by the dataset itself. We propose that every dataset has a unique data genome that evolves together with the dataset. The data genome is part of the data and actively records derivation information for it. The characteristics of the data genome show that the lineage of datasets can be uncovered by analyzing their data genomes. We also present computations on data genomes, such as clone, transmit, mutate and introject, to show how a data genome evolves to provide derivation information from the dataset itself.
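
To make the idea of a genome carried inside the dataset more concrete, the following is a minimal Python sketch. The abstract names the four computations but does not define them, so every detail here is an illustrative assumption and not the authors' model: the `Genome` and `Dataset` classes, the integer ids, the `log` field, the `related` helper, and in particular the semantics given to clone (copy with a new identity), transmit (move without changing identity), mutate (transform the rows) and introject (absorb another dataset).

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, FrozenSet, Tuple

# Hypothetical id source; a real system would need globally unique ids.
_ids = count(1)


@dataclass(frozen=True)
class Genome:
    gid: int                                # identity of this dataset version
    ancestors: FrozenSet[int] = frozenset() # ids of all genomes it derives from
    log: Tuple[str, ...] = ()               # operations recorded so far


@dataclass(frozen=True)
class Dataset:
    rows: Tuple            # the payload
    genome: Genome         # derivation record stored with the data itself


def _derive(parents, op: str) -> Genome:
    # A derived genome inherits every parent's id, ancestors and log.
    ancestors = frozenset().union(*[p.ancestors | {p.gid} for p in parents])
    log = tuple(entry for p in parents for entry in p.log) + (op,)
    return Genome(next(_ids), ancestors, log)


def create(rows) -> Dataset:
    return Dataset(tuple(rows), Genome(next(_ids), frozenset(), ("create",)))


def clone(ds: Dataset) -> Dataset:
    # Assumed semantics: same rows, new identity descended from the source.
    return Dataset(ds.rows, _derive([ds.genome], "clone"))


def transmit(ds: Dataset, dest: str) -> Dataset:
    # Assumed semantics: the dataset moves, its identity is unchanged.
    g = ds.genome
    return Dataset(ds.rows, Genome(g.gid, g.ancestors, g.log + (f"transmit:{dest}",)))


def mutate(ds: Dataset, fn: Callable) -> Dataset:
    # Assumed semantics: rows are transformed, yielding a descendant genome.
    return Dataset(tuple(fn(r) for r in ds.rows), _derive([ds.genome], "mutate"))


def introject(host: Dataset, guest: Dataset) -> Dataset:
    # Assumed semantics: host absorbs guest, so the result has two parents.
    return Dataset(host.rows + guest.rows,
                   _derive([host.genome, guest.genome], "introject"))


def related(a: Dataset, b: Dataset) -> bool:
    # Lineage check using only the genomes, with no external provenance store.
    line_a = a.genome.ancestors | {a.genome.gid}
    line_b = b.genome.ancestors | {b.genome.gid}
    return bool(line_a & line_b)
```

Under these assumptions, a dataset built as `introject(mutate(clone(a), f), b)` is reported by `related` as sharing lineage with both `a` and `b`, while two independently created datasets are not, which mirrors the abstract's claim that lineage can be uncovered by analyzing the genomes alone.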