A new class of learnable detectors for categorisation

Authors:
Jiri Matas;Karel Zimmermann
Affiliations:
Center for Machine Perception, Faculty of Electrotechnical Engineering, Czech Technical University in Prague;Center for Machine Perception, Faculty of Electrotechnical Engineering, Czech Technical University in Prague
Venue:
SCIA'05 Proceedings of the 14th Scandinavian conference on Image Analysis
Year:
2005

Abstract

A new class of image-level detectors that can be adapted by machine learning techniques to detect parts of objects from a given category is proposed. A classifier (e.g. a neural network or an AdaBoost-trained classifier) within the detector selects a relevant subset of extremal regions, i.e. regions that are connected components of a thresholded image. Properties of extremal regions render the detector very robust to illumination change. Robustness to viewpoint change is achieved by using invariant descriptors and/or by modeling shape variations by the classifier. The approach is applied to three problems: text detection, face segmentation and leopard skin detection. A detection rate of 92% was obtained for unconstrained (i.e. brightness-, affine- and font-invariant) text detection, with a reasonable false positive rate. The time complexity of the detection is approximately linear in the number of pixels, and a non-optimized implementation runs at about 1 frame per second on a 640×480 image on a high-end PC.
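
The two ideas the abstract relies on, extremal regions as connected components of a thresholded image and a classifier that keeps only the relevant ones, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes 4-connectivity and dark-on-bright regions (pixels at or below the threshold), and the `describe` features and the `is_relevant` rule are hypothetical stand-ins for the invariant descriptors and the trained classifier. The last lines check the illumination property on a toy image: a monotonically increasing change of intensities leaves the set of extremal regions unchanged.

```python
from collections import deque


def extremal_regions(image, threshold):
    """Return 4-connected components of pixels with intensity <= threshold.

    Each region is a frozenset of (row, col) pixel coordinates.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] > threshold:
                continue
            # Breadth-first flood fill from an unvisited below-threshold pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] <= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(frozenset(pixels))
    return regions


def all_extremal_regions(image):
    """Collect the regions found at every intensity level present in the image."""
    levels = sorted({v for row in image for v in row})
    return {region for t in levels for region in extremal_regions(image, t)}


def describe(region):
    """Toy descriptor (hypothetical): area and bounding-box aspect ratio."""
    ys = [y for y, _ in region]
    xs = [x for _, x in region]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {"area": len(region), "aspect": width / height}


def is_relevant(features):
    """Hand-written stand-in for the learned classifier (hypothetical rule)."""
    return features["area"] >= 3 and features["aspect"] <= 1.0


# Dark shapes on a bright background: a vertical bar and an isolated dark pixel.
image = [
    [9, 9, 9, 9, 9],
    [9, 2, 9, 9, 9],
    [9, 2, 9, 1, 9],
    [9, 2, 9, 9, 9],
    [9, 9, 9, 9, 9],
]

regions = extremal_regions(image, threshold=5)
selected = [r for r in regions if is_relevant(describe(r))]

# A monotonically increasing intensity change leaves the set of regions unchanged.
brighter = [[3 * v + 10 for v in row] for row in image]
same_regions = all_extremal_regions(image) == all_extremal_regions(brighter)
```

Here `selected` keeps only the vertical bar, and `same_regions` is true for this image. The sketch re-runs a flood fill for every threshold, so it does not reproduce the approximately linear running time stated in the abstract.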