Representing digital assets usingMPEG-21 Digital Item Declaration
International Journal on Digital Libraries
Efficient, automatic web resource harvesting
WIDM '06 Proceedings of the 8th annual ACM international workshop on Web information and data management
Hi-index | 0.00 |
HTTP and MIME, while sufficient for contemporary webpage access, do not provide enough forensic information to enable the long-term preservation of the resources they describe and transport. But what if the originating web server automatically provided preservation metadata encapsulated with the resource at time of dissemination? Perhaps the ingestion process could be streamlined, with additional forensic metadata available to future information archeologists. We have adapted an Apache web server implementation of OAI-PMH which can utilize third-party metadata analysis tools to provide a metadata-rich description of each resource. The resource and its forensic metadata are packaged together as a complex object, expressed in plain ASCII and XML. The result is a CRATE: a self-contained preservation-ready version of the resource, created at time of dissemination.