Archiving is important for scientific data, where it is necessary to record all past versions of a database in order to verify findings based upon a specific version. Much scientific data is held in a hierarchical format and has a key structure that provides a canonical identification for each element of the hierarchy. In this article, we exploit these properties to develop an archiving technique that is both efficient in its use of space and preserves the continuity of elements through versions of the database, something that is not provided by traditional minimum-edit-distance diff approaches. The approach also uses timestamps: all versions of the data are merged into one hierarchy, where an element appearing in multiple versions is stored only once along with a timestamp. By identifying the semantic continuity of elements and merging them into one data structure, our technique is capable of providing meaningful change descriptions. The archive also allows us to easily answer certain temporal queries, such as retrieving any specific version from the archive and finding the history of an element. This is in contrast with approaches that store a sequence of deltas, where such operations may require undoing a large number of changes or significant reasoning with the deltas. A suite of experiments also demonstrates that our archive does not incur any significant space overhead when contrasted with diff approaches. Another useful property of our approach is that we use XML format to represent hierarchical data, and the resulting archive is also in XML. Hence, XML tools can be directly applied to our archive. In particular, we apply an XML compressor to our archive, and our experiments show that our compressed archive outperforms compressed diff-based repositories in space efficiency.
We also show how we can extend our archiving tool to an external memory archiver for higher scalability and describe various index structures that can further improve the efficiency of some temporal queries on our archive.
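
The merging idea described above can be illustrated with a minimal sketch. It is not the article's implementation. It assumes that each version is a nested dictionary whose keys stand in for the canonical element identifiers, and that a timestamp is simply the set of version numbers in which an element or value occurs. The names `ArchiveNode`, `merge`, `retrieve`, and `history` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ArchiveNode:
    # Assumption: a timestamp is modelled as a plain set of version numbers.
    versions: Set[int] = field(default_factory=set)
    # Child elements, addressed by their (assumed) canonical key.
    children: Dict[str, "ArchiveNode"] = field(default_factory=dict)
    # Leaf values, each stored once together with the versions that hold it.
    values: Dict[str, Set[int]] = field(default_factory=dict)


def merge(node: ArchiveNode, data, version: int) -> None:
    """Merge one version into the archive, reusing elements with the same key."""
    node.versions.add(version)
    if isinstance(data, dict):
        for key, sub in data.items():
            merge(node.children.setdefault(key, ArchiveNode()), sub, version)
    else:
        node.values.setdefault(data, set()).add(version)


def retrieve(node: ArchiveNode, version: int):
    """Rebuild a single version; returns None if the element is absent from it."""
    if version not in node.versions:
        return None
    for value, stamps in node.values.items():
        if version in stamps:
            return value
    return {
        key: retrieve(child, version)
        for key, child in node.children.items()
        if version in child.versions
    }


def history(root: ArchiveNode, path: List[str]) -> Dict[str, List[int]]:
    """Return each value an element has held, with the versions holding it."""
    node = root
    for key in path:
        node = node.children.get(key)
        if node is None:
            return {}
    return {value: sorted(stamps) for value, stamps in node.values.items()}


# Illustrative data only: two versions of a tiny keyed hierarchy.
archive = ArchiveNode()
merge(archive, {"rec1": {"name": "A", "status": "draft"}}, 1)
merge(archive, {"rec1": {"name": "A", "status": "final"}, "rec2": {"name": "B"}}, 2)
print(retrieve(archive, 1))
print(history(archive, ["rec1", "status"]))
```

In this sketch, both version retrieval and element history are read directly from the merged structure, without replaying a sequence of deltas. A real archiver would likely store timestamps more compactly (for example as version intervals) and would serialize the merged hierarchy as XML; both are left out here.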