Many large collections of document images are now becoming available online as part of digital library initiatives, fueled by the explosive growth of the World Wide Web. In this paper, we examine protocols and system-related issues that arise in attempting to make use of these new resources, both as a target application (building better search engines) and as a way of overcoming the problem of acquiring ground-truth to support experimental document analysis research. We also report on our experiences running two simple tests involving data drawn from one such collection. The potential synergies between document analysis and digital libraries could lead to substantial benefits for both communities.