Unified access to heterogeneous data in cultural heritage

Authors:
Marijn Koolen;Avi Arampatzis;Jaap Kamps;Vincent de Keijzer;Nir Nussbaum
Affiliations:
University of Amsterdam, The Netherlands;University of Amsterdam, The Netherlands;University of Amsterdam, The Netherlands;Haags Gemeentemuseum, The Hague, The Netherlands;ISLA, University of Amsterdam, The Netherlands
Venue:
Large Scale Semantic Access to Content (Text, Image, Video, and Sound)
Year:
2007

Abstract

This paper addresses the prototypical problem of a cultural heritage institution that wants to disclose all of its content in a single, unified system. Like other enterprises, these institutions have heterogeneous collections distributed over multiple legacy systems. Our approach is to turn the metadata retrieval problem into a free-text retrieval problem by unconditionally merging the heterogeneous sub-collections and flattening all metadata structures. We investigate the viability of the approach through an extensive case study of a large museum. Our main findings are as follows. First, by converting all digital content to text and indexing it with a standard IR system, we can effectively build a unified system that provides access to all data. Second, an initial empirical evaluation shows superior performance compared with the legacy systems currently in use at the institution. Our third and overall finding is therefore that the approach is a viable option for giving access to heterogeneous collections.
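
As an illustration of the flatten-and-merge idea described in the abstract, the following is a minimal sketch and not the authors' system. It assumes each legacy system can export its records as nested dictionaries; the collection names, field names, and sample records are hypothetical, and the term-frequency scoring merely stands in for whatever ranking a standard IR system would apply.

```python
import re
from collections import Counter, defaultdict


def flatten(record):
    """Collect all values of a nested record into one string, dropping field names."""
    if record is None:
        return ""
    if isinstance(record, dict):
        parts = [flatten(value) for value in record.values()]
    elif isinstance(record, (list, tuple)):
        parts = [flatten(value) for value in record]
    else:
        parts = [str(record)]
    return " ".join(part for part in parts if part)


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(collections):
    """Merge all sub-collections into a single inverted index: term -> {doc_id: tf}."""
    index = defaultdict(Counter)
    for source, records in collections.items():
        for position, record in enumerate(records):
            doc_id = f"{source}:{position}"
            for term in tokenize(flatten(record)):
                index[term][doc_id] += 1
    return index


def search(index, query):
    """Rank documents by summed term frequency (a stand-in for a real ranking model)."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return [doc_id for doc_id, _ in scores.most_common()]


# Hypothetical records from two differently structured legacy systems.
collections = {
    "objects": [
        {
            "title": "Victory Boogie Woogie",
            "maker": {"name": "Piet Mondriaan"},
            "keywords": ["painting", "abstract"],
        }
    ],
    "library": [
        {"title": "Mondriaan and De Stijl", "subjects": ["Mondriaan", "De Stijl"]}
    ],
}

index = build_index(collections)
results = search(index, "mondriaan")
```

Because field names are discarded during flattening, a single query matches records from both sub-collections regardless of how each legacy system structured its metadata. The trade-off is that field-specific queries are no longer possible.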