This paper describes a new approach to document classification based on visual features alone. Text-based retrieval systems perform poorly on noisy text. We conducted a series of experiments using cosine distance as our similarity measure, varying both the number of local interest points selected per page and the number of nearest-neighbour points used in the similarity calculations. We found that a distance-based measure of similarity outperforms a rank-based measure except when there are few interest points. We show that visual features substantially outperform text-based approaches for noisy text, giving average precision in the range 0.4 to 0.43 in several experiments retrieving scientific papers.
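The contrast between a distance-based and a rank-based similarity measure can be illustrated with a minimal sketch. The abstract does not give the scoring formulas, so everything below is an assumption made for illustration: the function names, the toy two-dimensional descriptors, and the two scoring rules (adding `1 - distance` per neighbour versus adding `k - rank` per neighbour) are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def cosine_distance(u, v):
    """1 - cosine similarity: 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def nearest_neighbours(point, pages, k):
    """Return the k (distance, page_id) pairs closest to `point` over all pages."""
    scored = [
        (cosine_distance(point, descriptor), page_id)
        for page_id, descriptors in pages.items()
        for descriptor in descriptors
    ]
    return sorted(scored)[:k]


def distance_based_scores(query, pages, k):
    """Assumed rule: each neighbour adds (1 - distance) to its page's score."""
    scores = defaultdict(float)
    for point in query:
        for dist, page_id in nearest_neighbours(point, pages, k):
            scores[page_id] += 1.0 - dist
    return dict(scores)


def rank_based_scores(query, pages, k):
    """Assumed rule: each neighbour adds (k - rank), ignoring the distance value."""
    scores = defaultdict(float)
    for point in query:
        neighbours = nearest_neighbours(point, pages, k)
        for rank, (_, page_id) in enumerate(neighbours):
            scores[page_id] += k - rank
    return dict(scores)


# Toy data: each page is a list of interest-point descriptors (hypothetical values).
pages = {
    "paper_a": [[1.0, 0.0], [0.9, 0.1]],
    "paper_b": [[0.0, 1.0], [0.1, 0.9]],
}
query = [[1.0, 0.05], [0.8, 0.2]]

distance_scores = distance_based_scores(query, pages, k=2)
rank_scores = rank_based_scores(query, pages, k=2)
best_page = max(distance_scores, key=distance_scores.get)
```

In this sketch the two parameters varied in the experiments correspond to the length of `query` (interest points per page) and `k` (nearest-neighbour points). The rank-based rule discards how close a neighbour actually is, which is one plausible reason the two measures could behave differently as the number of interest points changes.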