Preserving privacy whilst integrating data: Applied to criminal justice

Authors:
Sunil Choenni;Jan van Dijk;Frans Leeuw
Affiliations:
Ministry of Justice, WODC, PO Box 20301, 2500 EH, The Hague, The Netherlands (corresponding author; Tel.: +31 70 370 6466; Fax: +31 70 370 7948; E-mail: r.choenni@minjus.nl); Ministry of Justice, WODC, PO Box 20301, 2500 EH, The Hague, The Netherlands; Ministry of Justice, WODC, PO Box 20301, 2500 EH, The Hague, The Netherlands
Venue:
Information Polity - Government 2.0: Making Connections between citizens, data and government
Year:
2010



Abstract

Privacy-preserving data integration is crucial for many standard and emerging criminal law Web 2.0 applications, such as mashups and dataspace systems. In many organizations, different databases hold different kinds of data about the same entity, often for legitimate reasons. Obtaining an integral, unified view of an entity therefore requires data reconciliation. In this paper, we present an approach to data reconciliation that is based on the available schemata of the data sources and on the content of those sources. The schemata are used to determine which parts of them pertain to the same entity type. The content is used to determine the associations between attributes stored in different sources. In establishing these relationships between attributes, we have also exploited the knowledge of domain experts. On the basis of the collected information, we identify a set of attributes common to the data sources. A similarity function is associated with each attribute; it takes a record from each data source as input and returns a similarity value expressing how "similar" the records are. Depending on the similarity value, we decide whether or not to reconcile two entities. We illustrate the effectiveness of our approach by means of a real-life case in the field of police and justice. The approach can support the development of a wide variety of criminal law applications, such as data warehouses, mashups, and dataspace systems.
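
The reconciliation step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the concrete attributes, similarity functions, aggregation rule, or threshold, so everything named below is a hypothetical assumption for illustration only: the attribute names (`name`, `birth_date`, `city`), the weights, the weighted-sum aggregation, and the cut-off value. It is not the authors' implementation.

```python
from difflib import SequenceMatcher

# Hypothetical common attributes and weights (assumed, not from the paper).
WEIGHTS = {"name": 0.5, "birth_date": 0.3, "city": 0.2}
# Assumed cut-off above which two records are reconciled.
THRESHOLD = 0.85


def string_similarity(attr):
    """Build a similarity function for a free-text attribute."""
    def sim(rec_a, rec_b):
        a, b = rec_a[attr].lower(), rec_b[attr].lower()
        return SequenceMatcher(None, a, b).ratio()
    return sim


def exact_similarity(attr):
    """Build a similarity function that only accepts identical values."""
    def sim(rec_a, rec_b):
        return 1.0 if rec_a[attr] == rec_b[attr] else 0.0
    return sim


# One similarity function per common attribute; each takes a record
# from each data source, as the abstract describes.
SIMILARITY = {
    "name": string_similarity("name"),
    "birth_date": exact_similarity("birth_date"),
    "city": string_similarity("city"),
}


def record_similarity(rec_a, rec_b):
    """Combine per-attribute similarities with a weighted sum (an assumption)."""
    return sum(WEIGHTS[attr] * SIMILARITY[attr](rec_a, rec_b) for attr in WEIGHTS)


def should_reconcile(rec_a, rec_b, threshold=THRESHOLD):
    """Decide whether two records are treated as the same entity."""
    return record_similarity(rec_a, rec_b) >= threshold
```

Under these assumed weights, two records that disagree on `birth_date` can score at most 0.7 and are therefore never reconciled at the 0.85 cut-off; in practice the weights and threshold would have to be tuned with domain experts.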