Digital Mountain: From Granite Archive to Global Access

Authors:
William Barrett; Luke Hutchison; Dallan Quass; Heath Nielson; Douglas Kennard
Venue:
DIAL '04 Proceedings of the First International Workshop on Document Image Analysis for Libraries (DIAL'04)
Year:
2004

Abstract

Large-scale, multi-terabyte digital libraries are becoming feasible due to decreasing costs of storage, CPU, and bandwidth. However, costs associated with preparing content for input into the library remain high due to the amount of human labor required. This paper describes the Digital Microfilm Pipeline, a sequence of image processing operations used to populate a large-scale digital library from a "mountain" of microfilm and reduce the human labor involved. Essential parts of the pipeline include algorithms for document zoning and labeling, consensus-based template creation, reversal of geometric transformations, and Just-In-Time Browsing, an interactive technique for progressive access to image content over a low-bandwidth medium. We also suggest more automated approaches to cropping, enhancement, and data extraction.