Scalability of databases for digital libraries

Authors:
John Chmura;Nattakarn Ratprasartporn;Gultekin Ozsoyoglu
Affiliations:
Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio;Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio;Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio
Venue:
ICADL'05 Proceedings of the 8th international conference on Asian Digital Libraries: implementing strategies and sharing experiences
Year:
2005

Abstract

Search engines of mainstream literature digital libraries such as ACM Digital Library, Google Scholar, and PubMed employ file-based systems and provide users with only basic Boolean keyword search functionality. As a result, new and powerful querying capabilities are not easy to implement on top of such systems and are not provided. In comparison, the query languages of database systems traditionally have high expressive power. This paper evaluates the scalability of deploying relational databases as backend systems for digital libraries, thereby making the query languages and query processing capabilities of database query engines available to literature digital libraries. To evaluate our approach, we built a scalable prototype digital library on top of a relational database management system, together with an advanced query interface that allows users to specify dynamic text and path queries in an intuitive, hierarchical manner. Specifically, we evaluate the scalability of two search query processing approaches, namely ad-hoc queries and pre-compiled queries (stored procedures). We demonstrate that, with reasonably priced hardware, we are able to build an RDBMS-based digital library search engine that can scale to handle millions of queries per day.
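
The distinction between the two query processing approaches named in the abstract can be illustrated with a minimal sketch. This is not the authors' prototype: the `paper` table, its rows, and both function names are hypothetical, and because SQLite has no stored procedures, a fixed parameterized statement stands in here for a pre-compiled query. The ad-hoc variant assembles new SQL text for every request, while the other variant reuses one constant statement and only binds new values.

```python
import sqlite3

# Hypothetical toy schema and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE paper (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO paper (title, year) VALUES (?, ?)",
    [
        ("Query processing in databases", 2001),
        ("Digital library interfaces", 2004),
        ("Scalable databases for libraries", 2005),
    ],
)


def adhoc_search(keyword, min_year):
    """Ad-hoc style: the SQL text is built per request, so the engine
    sees a different statement for each distinct combination of values."""
    escaped = keyword.replace("'", "''")  # minimal quoting for the sketch
    sql = (
        "SELECT title FROM paper "
        f"WHERE title LIKE '%{escaped}%' AND year >= {int(min_year)} "
        "ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql)]


# Stand-in for a pre-compiled query: the statement text never changes,
# only the bound parameters do.
PRECOMPILED_SQL = (
    "SELECT title FROM paper "
    "WHERE title LIKE '%' || ? || '%' AND year >= ? "
    "ORDER BY id"
)


def precompiled_search(keyword, min_year):
    """Pre-compiled style: reuse one fixed statement and bind values."""
    return [row[0] for row in conn.execute(PRECOMPILED_SQL, (keyword, min_year))]


if __name__ == "__main__":
    print(adhoc_search("databases", 2000))
    print(precompiled_search("databases", 2000))
```

Both functions return the same rows; they differ only in how much parsing and planning the engine may have to repeat per request. In a server RDBMS, the second style would typically be realized as a stored procedure or a prepared statement.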