The Entity Name System (ENS) is a service that aims to provide globally unique URIs for all kinds of real-world entities, such as persons, locations and products, based on descriptions of those entities. The entity descriptions available to the ENS for deciding on entity identity (do two descriptions refer to the same real-world entity?) change over time, so the system has to revise its past decisions whenever one entity has been given two different URIs or two entities have been assigned the same URI. The question we investigate in this context is: how do we propagate entity decision revisions to the clients that use the URIs provided by the ENS? In this paper we propose a solution that relies on labelling the IDs with additional history information. These labels allow clients to locally detect deprecated URIs they are using, and to merge IDs referring to the same real-world entity, without consulting the ENS. Making update requests to the ENS only for the IDs detected as deprecated considerably reduces the number of update requests, at the cost of a decrease in uniqueness quality. We investigate how much the number of update requests decreases with ID history labelling, and how this affects the uniqueness of the IDs on the client. For the experiments we use both artificially generated entity revision histories and a real case study based on the revision histories of the Dutch and Simple English Wikipedias.
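
To make the idea of local detection concrete, the following is a minimal sketch and not the paper's actual scheme. The abstract does not say how history labels are encoded, so the sketch assumes the simplest possible label: the plain set of URIs that an ID supersedes. The names `LabelledId`, `merge` and `find_deprecated`, and the `ens:` URIs, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LabelledId:
    """An entity URI plus a history label (assumed encoding: a set of URIs)."""
    uri: str
    history: frozenset = field(default_factory=frozenset)  # URIs this ID supersedes


def merge(new_uri, *parents):
    """Server side (assumed): issue an ID that supersedes the parents and their ancestors."""
    superseded = set()
    for parent in parents:
        superseded.add(parent.uri)
        superseded |= parent.history
    return LabelledId(new_uri, frozenset(superseded))


def find_deprecated(held):
    """Client side: a held URI is deprecated if another held ID's label contains it."""
    superseded = set()
    for labelled in held:
        superseded |= labelled.history
    return {labelled.uri for labelled in held if labelled.uri in superseded}


# Two URIs turn out to name the same entity and are merged, then revised again.
a = LabelledId("ens:a")
b = LabelledId("ens:b")
c = merge("ens:c", a, b)
d = merge("ens:d", c)

stale = find_deprecated([a, d])   # the client holds an old ID and a newer one
unseen = find_deprecated([a, b])  # the client has not yet seen any successor
```

In this sketch a client that holds both `ens:a` and `ens:d` can tell, without contacting the server, that `ens:a` has been superseded. A client that holds only `ens:a` and `ens:b` detects nothing, which illustrates the trade-off the abstract mentions: fewer update requests, but some stale IDs go unnoticed.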