High accuracy and language independent document retrieval with a fast invariant transform

Authors:
Qiong Liu; Hironori Yano; Don Kimber; Chunyuan Liao; Lynn Wilcox
Affiliations:
FX Palo Alto Laboratory, Palo Alto, CA; Internet Service Department, Fujifilm Corporation, Tokyo, Japan; FX Palo Alto Laboratory, Palo Alto, CA; FX Palo Alto Laboratory, Palo Alto, CA; FX Palo Alto Laboratory, Palo Alto, CA
Venue:
ICME'09: Proceedings of the 2009 IEEE International Conference on Multimedia and Expo
Year:
2009

Citing 5
Cited 8

Approximate nearest neighbor queries in fixed dimensions
SODA '93 Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms

Distinctive Image Features from Scale-Invariant Keypoints
International Journal of Computer Vision

Speeded-Up Robust Features (SURF)
Computer Vision and Image Understanding

HOTPAPER: multimedia interaction with paper using mobile phones
MM '08 Proceedings of the 16th ACM international conference on Multimedia

PCA-SIFT: a more distinctive representation for local image descriptors
CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition

PACER: toward a cameraphone-based paper interface for fine-grained and flexible interaction with documents
MM '09 Proceedings of the 17th ACM international conference on Multimedia

Embedded media markers: marks on paper that signify associated media
Proceedings of the 15th international conference on Intelligent user interfaces

Pacer: fine-grained interactive paper via camera-touch hybrid gestures on a cell phone
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

FACT: fine-grained cross-media interaction with documents via a portable hybrid paper-laptop interface
Proceedings of the international conference on Multimedia

Embedded media marker: linking multimedia to paper
Proceedings of the international conference on Multimedia

Embedded media barcode links: optimally blended barcode overlay on paper for linking to associated media
International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction

Discrete point based signatures and applications to document matching
ICIAP'11 Proceedings of the 16th international conference on Image analysis and processing: Part I

PaperUI
CBDAR'11 Proceedings of the 4th international conference on Camera-Based Document Analysis and Recognition

Abstract

This paper presents a tool and a novel Fast Invariant Transform (FIT) algorithm for language-independent access to electronic documents. The tool lets a person retrieve an e-document from an informal camera capture of its hardcopy. This saves people from remembering or browsing numerous directories and file names, and from paging through many pages or paragraphs of a single document. It can also make it easier for people to manipulate a document or to interact with one another through documents. Additionally, the algorithm is useful for binding multimedia data to paper documents regardless of their language. Our document recognition algorithm is inspired by the widely known SIFT descriptor [4], but it is much more efficient to compute, for both descriptor construction and search, and it uses much less storage space than the SIFT approach. In tests on randomly scaled and rotated document pages, the algorithm achieves a 99.73% page recognition rate on the 2188-page ICME06 proceedings and a 99.9% page recognition rate on a 504-page Japanese math book [2].
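
The abstract does not describe how FIT works internally, so the following is not the authors' algorithm. It is a minimal sketch of the general idea behind descriptor-based page recognition: each indexed page contributes a set of local descriptors, each descriptor from a camera capture votes for the page that owns its nearest indexed descriptor, and the page with the most votes wins. The function names `build_index` and `recognize_page`, the synthetic random descriptors, the descriptor dimension of 40, and the brute-force search are all assumptions made for illustration.

```python
import numpy as np

def build_index(page_descriptors):
    """Stack per-page descriptor arrays and record the source page of each row."""
    descs = np.vstack(page_descriptors)
    labels = np.concatenate(
        [np.full(len(d), i) for i, d in enumerate(page_descriptors)]
    )
    return descs, labels

def recognize_page(query, descs, labels, n_pages):
    """Return the page with the most nearest-neighbor votes, plus the vote counts."""
    # Squared Euclidean distance from every query descriptor to every indexed one
    # (brute force; a real system would use an approximate nearest-neighbor index).
    d2 = ((query[:, None, :] - descs[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    votes = np.bincount(labels[nearest], minlength=n_pages)
    return int(votes.argmax()), votes

# Hypothetical data: 20 pages, each with 50 random 40-dimensional descriptors.
rng = np.random.default_rng(0)
pages = [rng.normal(size=(50, 40)) for _ in range(20)]
descs, labels = build_index(pages)

# Simulate a noisy capture of page 7 by perturbing its descriptors.
query = pages[7] + rng.normal(scale=0.1, size=pages[7].shape)
best, votes = recognize_page(query, descs, labels, n_pages=len(pages))
print(best, votes[best])
```

A page recognition rate such as those quoted in the abstract would then be the fraction of test captures for which `best` equals the true page index. The brute-force distance computation is chosen here only for clarity; the efficiency and storage savings claimed for FIT concern exactly the descriptor construction and search steps that this sketch leaves naive.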