Assessing quality dynamics in unsupervised metadata extraction for digital libraries

Authors:
Alexander Ivanyukovich;Maurizio Marchese;Patrick Reuther
Affiliations:
University of Trento, Department of Information and Communication Technology, Trento, Italy;University of Trento, Department of Information and Communication Technology, Trento, Italy;University of Trier, Department for Databases and Information Systems, Trier, Germany
Venue:
ECDL'07 Proceedings of the 11th European conference on Research and Advanced Technology for Digital Libraries
Year:
2007

Abstract

Current research in large-scale information management systems focuses on unsupervised methods and techniques for information processing. Such approaches scale with the present-day exponential growth in information processing needs. In this paper we address the problem of automated quality evaluation for a completely unsupervised metadata extraction process in the Digital Libraries domain. In particular, we investigate the quality of the metadata produced by a specific extraction methodology for scientific documents. We propose and discuss precise quality metrics, and we measure the dynamics of these metrics as a function of the information extracted from the repository and of the size of the repository.