Scalable Digital Libraries Based on NCSTRL/Dienst

Authors:
Kurt Maly; Mohammad Zubair; Hesham Anan; Dun Tan; Yunchuan Zhang
Venue:
ECDL '00 Proceedings of the 4th European Conference on Research and Advanced Technology for Digital Libraries
Year:
2000

Abstract

NCSTRL (the Networked Computer Science Technical Report Library) is a successful digital library for scientific and technical information. It uses the Dienst protocol, which was developed by the ARPA-funded CS-TR project. We encountered several problems while implementing two large-scale NCSTRL-based libraries: UPS for Los Alamos and JDL for JTASC. The document collections for these libraries can range from several hundred thousand to a few million documents. First, we found that the native Dienst implementation does not scale beyond approximately 30,000 records. Second, the implementation is tightly coupled to the Unix platform. Finally, the NCSTRL search interface offers limited usability when a search returns a large number of hits. To address these problems, we replaced the Dienst repository service implementation with an Oracle-based implementation using servlet technology. The Oracle database stores the index information (metadata) and is partitioned horizontally to speed up searching across different archives. Furthermore, indexes were built to speed up searches on key fields such as author name, title, and abstract. Our implementation significantly reduced the average user wait time for searches that return a large number of hits. In addition, we gain the other benefits of servlet technology, such as efficiency and portability. In this paper, we present the performance results of the new implementation and compare them with those of the Dienst protocol implementation in NCSTRL.
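
The storage design described in the abstract (metadata partitioned horizontally by archive, with indexes on the search keys) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the table names, column names, archive identifiers, and the paging parameters are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical archive identifiers; each archive gets its own metadata table,
# which stands in for one horizontal partition of the metadata store.
ARCHIVES = ("ups", "jdl")


def create_partitions(conn, archives=ARCHIVES):
    """Create one metadata table per archive, indexed on author and title."""
    for name in archives:
        conn.execute(
            f"CREATE TABLE meta_{name} ("
            "id TEXT PRIMARY KEY, author TEXT, title TEXT, abstract TEXT)"
        )
        conn.execute(f"CREATE INDEX idx_{name}_author ON meta_{name}(author)")
        conn.execute(f"CREATE INDEX idx_{name}_title ON meta_{name}(title)")


def add_record(conn, archive, rec_id, author, title, abstract):
    """Insert one metadata record into the partition for its archive."""
    if archive not in ARCHIVES:
        raise ValueError(f"unknown archive: {archive}")
    conn.execute(
        f"INSERT INTO meta_{archive} VALUES (?, ?, ?, ?)",
        (rec_id, author, title, abstract),
    )


def search_by_author(conn, author, archives=ARCHIVES, page=0, page_size=10):
    """Query only the requested partitions and return one page of hits.

    For simplicity this sketch collects all matching rows and slices them
    in memory; a production system would push paging into the query.
    """
    hits = []
    for name in archives:
        if name not in ARCHIVES:
            raise ValueError(f"unknown archive: {name}")
        rows = conn.execute(
            f"SELECT id, title FROM meta_{name} WHERE author = ? ORDER BY id",
            (author,),
        ).fetchall()
        hits.extend((name, rec_id, title) for rec_id, title in rows)
    start = page * page_size
    return hits[start:start + page_size]
```

Restricting a query to a subset of archives means the other partitions are never scanned, and returning results a page at a time is one plausible way to keep wait times short when a search produces many hits.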