The provenance of a piece of data is knowledge about its origin, in terms of the entities and actors involved in its creation, e.g. the data sources used, the operations carried out on them, and the users enacting those operations. Provenance is used to better understand the data and the context of its production, and to assess its reliability by establishing whether correct procedures were followed. Providing evidence for validating research is of particular importance in the biomedical domain, where the strength of the results depends on the data sources and processes used. In recent times, previously manual processes have become fully or semi-automated, e.g. clinical trial recruitment, epidemiological studies, and diagnosis. Such automation is typically achieved through interactions of heterogeneous software systems in multiple settings (hospitals, clinics, academic and industrial research organisations). Provenance traces from these systems need to be integrated in a consistent and meaningful manner, but since the systems rarely share a common platform, provenance interoperability between them has to be achieved at the level of conceptual models. It is a non-trivial matter to determine where to start in making a biomedical software system provenance-aware. In this paper, we specify recommendations to developers on how to approach provenance modelling, capture, security, storage and querying, based on our experiences with two large-scale biomedical research projects: Translational Research and Patient Safety in Europe (TRANSFoRm) and Electronic Health Records for Clinical Research (EHR4CR). While illustrated with concrete issues encountered, the recommendations are of a sufficiently high level to be reusable across the biomedical domain.