Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries
Academic papers contain multiple figures that present important findings and experimental results. We present a search engine focused specifically on figures in academic documents. It lets users search figures in approximately 150,000 chemistry journal articles, although the method is easily extended to other domains. Our system indexes figure captions and mentions extracted from the PDF documents using a custom-built extractor. Recall and precision of figure extraction are in the 80 to 90% range. We describe the framework for the extraction algorithm, the system architecture, and the ranking function.
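To make the idea of indexing figure captions and mentions concrete, the following is a minimal sketch, not the extractor or ranking function used by the system described above. It assumes the PDF text has already been converted to plain lines, that captions start with "Figure N." or "Fig. N:", and that ranking is a plain term-frequency sum. All names (`extract_figures`, `build_index`, `search`) and both regular expressions are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a caption is a line beginning with "Figure N." or "Fig. N:".
CAPTION_RE = re.compile(r"^(?:Figure|Fig\.)\s*(\d+)\s*[.:]\s*(.+)$", re.IGNORECASE)
# Assumption: a mention is any other line that refers to "Figure N" or "Fig. N".
MENTION_RE = re.compile(r"\b(?:Figure|Fig\.)\s*(\d+)\b", re.IGNORECASE)


def extract_figures(lines):
    """Map figure number -> {"caption": str, "mentions": [str]} for one document."""
    figures = defaultdict(lambda: {"caption": "", "mentions": []})
    for raw in lines:
        line = raw.strip()
        caption = CAPTION_RE.match(line)
        if caption:
            figures[int(caption.group(1))]["caption"] = caption.group(2)
            continue
        # Record the line once per distinct figure number it refers to.
        for number in sorted({int(n) for n in MENTION_RE.findall(line)}):
            figures[number]["mentions"].append(line)
    return dict(figures)


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(doc_id, figures, index):
    """Add each figure's caption and mention text to an inverted index.

    index maps token -> {(doc_id, figure_number): term frequency}.
    """
    for number, figure in figures.items():
        text = figure["caption"] + " " + " ".join(figure["mentions"])
        for token in tokenize(text):
            index[token][(doc_id, number)] += 1


def search(index, query):
    """Rank figures by summed term frequency of the query tokens (a stand-in score)."""
    scores = defaultdict(int)
    for token in tokenize(query):
        for key, frequency in index.get(token, {}).items():
            scores[key] += frequency
    return sorted(scores, key=lambda key: (-scores[key], key))
```

A caller would create the index with `defaultdict(lambda: defaultdict(int))`, call `build_index` once per document, and then pass free-text queries to `search`, which returns `(doc_id, figure_number)` pairs in descending score order. Treating captions and mentions as one bag of words is a simplification chosen for brevity; a real system would likely weight the two fields differently.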