Sync your data: update propagation for heterogeneous protein databases

Authors:
T. Claypool; A. Rundensteiner
Affiliations:
Department of Computer Science, University of Massachusetts, USA; Department of Computer Science, Worcester Polytechnic Institute, USA
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2005

Abstract

The traditional model of bench (wet) chemistry in many life sciences domains is today actively complemented by computer-based discoveries that draw on the growing number of online data sources. A typical computer-based discovery scenario for many life scientists includes the creation of local caches of pertinent information from multiple online resources, such as Swissprot [Nucleic Acids Res. 28(1), 45–48 (2000)], PIR [Nucleic Acids Res. 28(1), 41–44 (2000)], and PDB [The Protein DataBank. Wiley, New York (2003)], to enable efficient data analysis. This local caching of data, however, exposes their research and eventual results to the problem of data staleness: cached data may quickly become obsolete or incorrect, depending on the updates made to the source data. This represents a significant challenge to the scientific community, forcing scientists to remain continuously aware of the frequent changes made to public data sources and, more importantly, of the potential effects on their own derived data sets during the course of their research. To address this challenge, we present an approach for handling update propagation between heterogeneous databases, guaranteeing data freshness for scientists irrespective of their choice of data source and its underlying data model or interface. We propose a middle-layer-based solution in which, first, the change in the online data source is translated to a sequence of changes in the middle layer; next, each change in the middle layer is propagated through an algebraic representation of the translation between the source and the target; and finally, the net change is translated to a set of changes that are then applied to the local cache.
In this paper, we present our algebraic model, which represents the mapping of the online resource to the local cache, as well as our adaptive propagation algorithm, which can incrementally propagate both schema and data changes from the source to the cache in a data-model-independent manner. We present a case study based on an ongoing joint project with our collaborators in the Chemistry Department at UMass-Lowell to explicate our approach.
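
The three propagation steps described above can be illustrated with a small Python sketch. It is not the paper's algebra or algorithm. All names in it (`Change`, `to_middle_layer`, `select_where`, `rename_field`, `propagate`, `apply_to_cache`, `sync`) are hypothetical, the mapping is assumed to be a simple chain of per-record operators, and only data changes are handled (schema changes are left out).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Change:
    """A record-level change (hypothetical middle-layer representation)."""
    kind: str                     # "insert", "delete", or "update"
    key: str                      # identifier of the affected record
    value: Optional[dict] = None  # new record contents, if any


# An operator maps one change to a translated change, or None to drop it.
Operator = Callable[[Change], Optional[Change]]


def to_middle_layer(source_change: Change) -> List[Change]:
    """Step 1: translate a source change into a sequence of primitive changes.

    Assumption: an update is modelled as a delete followed by an insert.
    """
    if source_change.kind == "update":
        return [Change("delete", source_change.key),
                Change("insert", source_change.key, source_change.value)]
    return [source_change]


def select_where(pred: Callable[[dict], bool]) -> Operator:
    """Mapping operator: keep only inserted records that satisfy pred."""
    def op(change: Change) -> Optional[Change]:
        if change.kind == "insert" and not pred(change.value):
            return None           # record is not part of the cached view
        return change             # deletes always pass through
    return op


def rename_field(old: str, new: str) -> Operator:
    """Mapping operator: rename one field of the record."""
    def op(change: Change) -> Optional[Change]:
        if change.value is None or old not in change.value:
            return change
        renamed = {(new if k == old else k): v for k, v in change.value.items()}
        return Change(change.kind, change.key, renamed)
    return op


def propagate(changes: List[Change], operators: List[Operator]) -> List[Change]:
    """Step 2: push each primitive change through the chain of operators."""
    result = []
    for change in changes:
        current: Optional[Change] = change
        for op in operators:
            current = op(current)
            if current is None:
                break
        if current is not None:
            result.append(current)
    return result


def apply_to_cache(cache: Dict[str, dict], changes: List[Change]) -> None:
    """Step 3: apply the resulting changes to the local cache in order."""
    for change in changes:
        if change.kind == "delete":
            cache.pop(change.key, None)
        else:
            cache[change.key] = change.value


def sync(cache: Dict[str, dict], source_change: Change,
         operators: List[Operator]) -> None:
    """Run all three steps for a single source change."""
    apply_to_cache(cache, propagate(to_middle_layer(source_change), operators))
```

With a mapping such as `[select_where(lambda r: r["organism"] == "human"), rename_field("seq", "sequence")]`, calling `sync` once per source change keeps the cache aligned with the source without recomputing it from scratch. An update that moves a record outside the selection removes it from the cache, because the delete half passes through while the insert half is filtered out.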