The Web contains a huge volume of information, almost all of it unstructured and therefore difficult to manage. In digital libraries, by contrast, information is explicitly organized, described, and managed. In this paper, we propose an architecture for building digital libraries from the Web. It uses standard protocols and archival technologies and incorporates powerful digital library and data extraction tools, thereby benefiting from the breadth of Web content while supporting the services and organization available in digital libraries. We applied the proposed architecture to the Networked Digital Library of Theses and Dissertations, an important first step toward the rapid construction of large digital libraries from the Web and toward a large-scale solution for interoperability between independent digital libraries.