Figures carry important non-textual information in scientific documents. Current digital libraries do not give users tools to retrieve documents based on the information contained in figures. We propose an architecture for retrieving documents by integrating figures with other information. The first step toward integrated document search is to categorize figures into a set of pre-defined types. We propose several figure categories based on the functions figures serve in scholarly articles, and we have developed a machine-learning-based approach for categorizing figures automatically. The approach uses both global features, such as texture, and part features, such as lines, to discriminate among figure categories. It has been evaluated on a testbed document set collected from the CiteSeer scientific literature digital library. The experimental evaluation shows that our algorithms produce acceptable results for real-world use. Our tools will be integrated into a scientific document digital library.
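To make the idea of combining global and part features concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method evaluated in the paper: the feature definitions (ink density and transition rate as crude stand-ins for texture, longest ink runs as a stand-in for line detection), the nearest-centroid classifier, the category names `plot` and `photo`, and all function names are hypothetical choices made for this example.

```python
from math import dist


def longest_run(seq):
    """Length of the longest run of 1s in a sequence of 0/1 values."""
    best = cur = 0
    for v in seq:
        cur = cur + 1 if v else 0
        best = max(best, cur)
    return best


def figure_features(img):
    """Map a binary image (list of rows of 0/1) to a small feature vector."""
    h, w = len(img), len(img[0])
    # Global features (assumed stand-ins for texture):
    # ink density and rate of horizontal 0/1 transitions.
    density = sum(map(sum, img)) / (h * w)
    transitions = sum(
        row[x] != row[x + 1] for row in img for x in range(w - 1)
    ) / (h * (w - 1))
    # Part features (assumed stand-ins for line detection):
    # longest horizontal and vertical ink runs, normalised by image size.
    h_line = max(longest_run(row) for row in img) / w
    v_line = max(longest_run(col) for col in zip(*img)) / h
    return [density, transitions, h_line, v_line]


def train_centroids(labelled):
    """labelled: list of (image, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for img, label in labelled:
        feats = figure_features(img)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, v in enumerate(feats):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def classify(img, centroids):
    """Assign the label whose centroid is nearest in feature space."""
    feats = figure_features(img)
    return min(centroids, key=lambda lab: dist(feats, centroids[lab]))


# Toy 8x8 training images: a 2-D plot (two axes) and a textured photo
# (checkerboard). Both are hypothetical examples, not data from the paper.
plot = [[1] + [0] * 7 for _ in range(7)] + [[1] * 8]
photo = [[(x + y) % 2 for x in range(8)] for y in range(8)]
centroids = train_centroids([(plot, "plot"), (photo, "photo")])

# A new plot-like figure: the same axes plus one data point.
query = [row[:] for row in plot]
query[2][3] = 1
predicted = classify(query, centroids)
```

The point of the sketch is the shape of the pipeline: each figure becomes one vector that concatenates whole-image statistics with statistics about detected parts, and any standard classifier can then be trained on those vectors.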