The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) began as an alternative to distributed searching of scholarly eprint repositories. The OAI-PMH embraces a metadata harvesting model, in which a "service provider" builds value-added services on cached copies of metadata extracted from the repositories that the harvester chooses. While this model dispenses with the well-known problems of distributed searching, it introduces the problem of synchronization: the service provider's copy of the metadata may not match the metadata currently held at the constituent repositories. We define metrics for describing the synchronization problem in the OAI-PMH. Based on these metrics, we study the synchronization problem in the OAI-PMH framework and propose several approaches that let harvesters synchronize better. In particular, if a repository knows its update frequency, it can publish it in an OAI-PMH Identify response using an optional About container that borrows from RDF Site Syndication (RSS) Format.
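To make the synchronization problem concrete, the following is a minimal Python sketch of how a harvester might quantify it and then request only changed records. The `freshness` definition (the fraction of repository records whose cached datestamp matches the repository's) is an illustrative assumption and not necessarily one of the metrics defined in the paper. The function names and the example base URL are hypothetical. The `verb`, `metadataPrefix`, and `from` arguments are standard OAI-PMH request parameters.

```python
from datetime import datetime
from urllib.parse import urlencode


def freshness(repo_stamps, cache_stamps):
    """Fraction of repository records whose cached copy is current.

    Both arguments map a record identifier to its last-modified datestamp.
    Assumption: this definition is illustrative, not the paper's metric.
    """
    if not repo_stamps:
        return 1.0  # an empty repository is trivially in sync
    current = sum(
        1
        for ident, stamp in repo_stamps.items()
        if cache_stamps.get(ident) == stamp
    )
    return current / len(repo_stamps)


def incremental_request(base_url, last_harvest, prefix="oai_dc"):
    """Build a ListRecords URL covering records changed since last_harvest.

    Day granularity (YYYY-MM-DD) is used for the datestamp.
    """
    params = {
        "verb": "ListRecords",
        "metadataPrefix": prefix,
        "from": last_harvest.strftime("%Y-%m-%d"),
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical datestamps: record "b" changed after the last harvest.
    repo = {"a": "2003-05-01", "b": "2003-05-03"}
    cache = {"a": "2003-05-01", "b": "2003-05-02"}
    print(freshness(repo, cache))
    print(incremental_request("http://example.org/oai", datetime(2003, 5, 2)))
```

A harvester that knows a repository's published update frequency could use it to decide how often to issue such incremental requests, rather than polling at a fixed interval.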