The popularity of camera phones enables many exciting multimedia applications. In this paper, we present a novel technology and several applications that allow users to interact with paper documents, books, and magazines. The interaction consists of reading and writing electronic information, such as images, web URLs, video, and audio, to the paper medium by pointing a camera phone at a patch of text on a document. Our application does not require any special markings, barcodes, or watermarks on the paper document. Instead, we propose a document recognition algorithm that, given a small document image, automatically determines the location of that patch of text within a large collection of document images. This is very challenging because the majority of phone cameras lack autofocus and macro capabilities and produce low-quality images and video. We developed a novel algorithm, Brick Wall Coding (BWC), that performs image-based document recognition on mobile phone video frames. Given a document patch image, BWC uses the layout, i.e., the relative locations, of word boxes to determine the original file, the page, and the location on the page. BWC runs in real time (4 frames per second) on a Treo 700w smartphone with a 312 MHz processor and 64 MB of RAM. Our method can recognize blurry document patch frames that contain as few as 4-5 lines of text at a video resolution as low as 176x144. We performed experiments by indexing 4397 document pages and querying this database with 533 document patches. Besides describing the basic algorithm, this paper also describes several applications enabled by mobile phone-paper interaction, such as inserting electronic annotations into paper documents, using paper as a tangible interface to collect and communicate multimedia data, and collaborative homework.
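The abstract does not give the details of BWC, so the following is not the authors' algorithm. It is only a minimal sketch of the general idea of locating a patch from the layout of its word boxes, under several assumptions: word boxes are already extracted as (width, height) pairs grouped into text lines, widths are quantized in units of line height so that the code is independent of image scale, and pages are ranked by simple voting. All names (`quantize`, `layout_features`, `build_index`, `locate`) and the parameters `k` and `step` are hypothetical.

```python
from collections import Counter, defaultdict


def quantize(width, height, step=0.5):
    """Express a word-box width in units of line height, rounded to `step`.

    Dividing by the height makes the code independent of image scale.
    """
    return round(width / height / step)


def layout_features(lines, k=3):
    """Yield (feature, line_index) for every run of k adjacent word boxes.

    `lines` is a list of text lines; each line is a list of
    (width, height) word boxes in reading order (an assumed input format).
    """
    for line_index, line in enumerate(lines):
        codes = [quantize(w, h) for w, h in line]
        for start in range(len(codes) - k + 1):
            yield tuple(codes[start:start + k]), line_index


def build_index(pages, k=3):
    """Map each layout feature to the (page_id, line_index) pairs where it occurs."""
    index = defaultdict(list)
    for page_id, lines in pages.items():
        for feature, line_index in layout_features(lines, k):
            index[feature].append((page_id, line_index))
    return index


def locate(index, patch_lines, k=3):
    """Return (page_id, line_offset) with the most votes, or None if nothing matches.

    `line_offset` is the page line that corresponds to the first patch line.
    """
    votes = Counter()
    for feature, patch_line in layout_features(patch_lines, k):
        for page_id, page_line in index.get(feature, ()):
            votes[(page_id, page_line - patch_line)] += 1
    if not votes:
        return None
    best, _count = votes.most_common(1)[0]
    return best
```

To use the sketch, build the index once from a dictionary that maps page identifiers to their lines of word boxes, then call `locate` with the word boxes extracted from each camera frame. Voting on the pair (page, line offset) means that matches from different lines of the same patch reinforce each other only when they agree on where the patch sits on the page.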