Figures in digital documents contain important information, yet current digital libraries neither summarize nor index the information within figures for document retrieval. We present a system that automatically categorizes figures and extracts data from 2-D plots. A machine-learning-based method categorizes figures into a set of predefined types based on image features, and an automated algorithm extracts data values from solid line curves in 2-D plots. The semantic type of each figure and the data values extracted from 2-D plots can be integrated with the textual information in a document to provide more effective retrieval services for digital library users. Experimental evaluation shows that the system produces results suitable for real-world use.
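To make the idea of extracting data values from a solid line curve concrete, the following is a minimal illustrative sketch and not the algorithm used by the system described above. It assumes that the plot area has already been cropped and binarized into a grid of 0/1 pixels, that the grid contains a single curve, and that both axes are linear with known ranges. The function name `extract_curve_points` and its parameters are hypothetical.

```python
def extract_curve_points(mask, x_range, y_range):
    """Map curve pixels in a binary plot-area mask to data coordinates.

    Assumptions (not taken from the abstract): `mask` is a list of rows of
    0/1 values, row 0 is the top of the plot area, the mask holds one curve,
    and both axes are linear over `x_range` and `y_range`.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    if height < 2 or width < 2:
        raise ValueError("mask must be at least 2x2 pixels")

    x_min, x_max = x_range
    y_min, y_max = y_range
    points = []
    for col in range(width):
        # Collect the rows where the curve is present in this pixel column.
        rows = [r for r in range(height) if mask[r][col]]
        if not rows:
            continue  # no curve pixel here, e.g. a gap in the line
        # A thick line covers several rows; use their mean as the curve center.
        row_center = sum(rows) / len(rows)
        # Linear pixel-to-data mapping; the y axis is flipped in image space.
        x = x_min + (x_max - x_min) * col / (width - 1)
        y = y_max - (y_max - y_min) * row_center / (height - 1)
        points.append((x, y))
    return points


# Hypothetical usage: a 3x3 mask with a diagonal line from bottom-left to top-right.
diagonal = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
]
points = extract_curve_points(diagonal, x_range=(0.0, 2.0), y_range=(0.0, 2.0))
```

Taking one value per pixel column keeps the sketch simple, but it only suits curves that are single-valued in x. Overlapping curves, dashed lines, and axis calibration from tick labels would need additional handling that this sketch does not attempt.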