Managing gigabytes (2nd ed.): compressing and indexing documents and images
Managing gigabytes (2nd ed.): compressing and indexing documents and images
Fedora: an architecture for complex objects and their relationships
International Journal on Digital Libraries
A retrospective look at Greenstone: lessons from the first decade
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
Non-interactive OCR post-correction for giga-scale digitization projects
CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing
Towards very large scale digital library building in greenstone using parallel processing
ICADL'11 Proceedings of the 13th international conference on Asia-pacific digital libraries: for cultural heritage, knowledge dissemination, and future creation
Flexible design for simple digital library tools and services
Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference
Hi-index | 0.00 |
DSpace, Fedora, and Greenstone are three widely used open source digital library systems. In this paper we report on scalability tests performed on these tools by ourselves and others. These range from repositories populated with synthetically produced data to real world deployment with content measured in millions of items. A case study is presented that details how one of the systems performed when used to produce fully-searchable newspaper collections containing in excess of 20 GB of raw text (2 billion words, with 60 million unique terms), 50 GB of metadata, and 570 GB of images.