Annual review of information science and technology, vol. 22
Term-weighting approaches in automatic text retrieval
Information Processing and Management: an International Journal
Automatic text processing: the transformation, analysis, and retrieval of information by computer
Information retrieval
International Journal of Computer Vision
Knowledge-Directed Interpretation of Mechanical Engineering Drawings
IEEE Transactions on Pattern Analysis and Machine Intelligence
A word shape analysis approach to lexicon based word recognition
Pattern Recognition Letters
Intelligent forms processing system
Machine Vision and Applications - Special issue: document image analysis techniques
Query expansion using lexical-semantic relations
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Incorporation of a Markov model of language syntax in a text recognition algorithm
Document image analysis
TREC-2 Proceedings of the second conference on Text retrieval conference
Indexing handwriting using word matching
Proceedings of the first ACM international conference on Digital libraries
Font and function word identification in document recognition
Computer Vision and Image Understanding
Periodicity, Directionality, and Randomness: Wold Features for Image Modeling and Retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence
MARCO: MAp Retrieval by COntent
IEEE Transactions on Pattern Analysis and Machine Intelligence
Texture Features for Browsing and Retrieval of Image Data
IEEE Transactions on Pattern Analysis and Machine Intelligence
Query expansion using local and global document analysis
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Pivoted document length normalization
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Querying across languages: a dictionary-based approach to multilingual information retrieval
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Experiments in multilingual information retrieval using the SPIDER system
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Applying algebraic and differential invariants for logo recognition
Machine Vision and Applications
Phrasal translation and query expansion techniques for cross-language information retrieval
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Word spotting: indexing handwritten manuscripts
Intelligent multimedia information retrieval
NETRA: a toolbox for navigating large image databases
Resolving ambiguity for cross-language retrieval
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
The indexing and retrieval of document images: a survey
Computer Vision and Image Understanding - Special issue on document image understanding and retrieval
Summarization of imaged documents without OCR
Computer Vision and Image Understanding - Special issue on document image understanding and retrieval
Comparing images using joint histograms
Multimedia Systems - Special issue on video content based retrieval
A vector space model for automatic indexing
Communications of the ACM
Introduction to Modern Information Retrieval
Keyword Spotting in Poorly Printed Documents using Pseudo 2-D Hidden Markov Models
IEEE Transactions on Pattern Analysis and Machine Intelligence
Logo and Word Matching Using a General Approach to Signal Registration
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Document image similarity and equivalence detection
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Using Character Shape Coding for Information Retrieval
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
An Approximate String Match for Garbled Text with Various Accuracy
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Retrieval methods for English-text with missrecognized OCR characters
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Document image database retrieval and browsing using texture analysis
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Image Categorization Using Texture Features
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Measuring the Effects of OCR Errors on Similarity Linking
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
The Retrieval of Document Images: A Brief Survey
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
The Detection of Duplicates in Document Image Databases
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Probabilistic Retrieval of OCR Degraded Text Using N-Grams
ECDL '97 Proceedings of the First European Conference on Research and Advanced Technology for Digital Libraries
Cross-Language Information Retrieval in a Multilingual Legal Domain
ECDL '97 Proceedings of the First European Conference on Research and Advanced Technology for Digital Libraries
Extraction of Indicative Summary Sentences from Imaged Documents
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Models and algorithms for efficient color image indexing
CBAIVL '97 Proceedings of the 1997 Workshop on Content-Based Access of Image and Video Libraries (CBAIVL '97)
Image Indexing Using Color Correlograms
CVPR '97 Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97)
Word Spotting: A New Approach to Indexing Handwriting
CVPR '96 Proceedings of the 1996 Conference on Computer Vision and Pattern Recognition (CVPR '96)
Structural Compression for Documents Analysis
ICPR '96 Proceedings of the International Conference on Pattern Recognition (ICPR '96), Volume III
Clustering OCR-ed texts for browsing document image database
ICDAR '95 Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 1)
Dienst: Implementation Reference Manual
Experiments in Multi-Lingual Information Retrieval
Adaptive vector space text filtering for monolingual and cross-language application
Color-spatial image indexing and applications
A blueprint for automatic indexing
ACM SIGIR Forum
The Text REtrieval Conferences (TRECs)
TIPSTER '96 Proceedings of a workshop held at Vienna, Virginia: May 6-8, 1996
The text retrieval conferences (TRECS)
TIPSTER '98 Proceedings of a workshop held at Baltimore, Maryland: October 13-15, 1998
Retrieval by Layout Similarity of Documents Represented with MXY Trees
DAS '02 Proceedings of the 5th International Workshop on Document Analysis Systems V
New Challenges for Cross-Language Information Retrieval: Multimedia Data and the User Experience
CLEF '00 Revised Papers from the Workshop of Cross-Language Evaluation Forum on Cross-Language Information Retrieval and Evaluation
Information Retrieval in Document Image Databases
IEEE Transactions on Knowledge and Data Engineering
Identification of common methods used for ontology integration tasks
Proceedings of the first international workshop on Interoperability of heterogeneous information systems
Hangul Document Image Retrieval System Using Rank-based Recognition
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
An Approach for Stemming in Symbolically Compressed Indian Language Imaged Documents
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
Font Adaptive Word Indexing of Modern Printed Documents
IEEE Transactions on Pattern Analysis and Machine Intelligence
Retrieval of machine-printed Latin documents through Word Shape Coding
Pattern Recognition
A survey and classification of semantic search approaches
International Journal of Metadata, Semantics and Ontologies
Effectively Searching Maps in Web Documents
ECIR '09 Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval
Scalable indexing for layout based document retrieval and ranking
Proceedings of the 2010 ACM Symposium on Applied Computing
A kernel-based approach to document retrieval
DAS '10 Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
Decomposing background topics from keywords by principal component pursuit
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
A method for user profile adaptation in document retrieval
ACIIDS'11 Proceedings of the Third international conference on Intelligent information and database systems - Volume Part II
Comparative information retrieval evaluation for scanned documents
Proceedings of the 15th WSEAS international conference on Computers
Event log mining tool for large scale HPC systems
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Improved stable retrieval in noisy collections
ICTIR'11 Proceedings of the Third international conference on Advances in information retrieval theory
ICVGIP'06 Proceedings of the 5th Indian conference on Computer Vision, Graphics and Image Processing
Efficient word retrieval by means of SOM clustering and PCA
DAS'06 Proceedings of the 7th international conference on Document Analysis Systems
Exploring digital libraries with document image retrieval
ECDL'07 Proceedings of the 11th European conference on Research and Advanced Technology for Digital Libraries
Keyword spotting in unconstrained handwritten Chinese documents using contextual word model
Image and Vision Computing
Given the phenomenal growth in the variety and quantity of data available to users through electronic media, there is great demand for efficient and effective ways to organize and search this information. Besides speech, our principal means of communication is visual media, and in particular documents. In this paper, we provide an update on Doermann's comprehensive survey (1998) of research results in the broad area of document-based information retrieval. The scope of this survey is somewhat broader than that of the earlier one, and it places greater emphasis on relating document image analysis methods to conventional IR methods.

Documents are available in a wide variety of formats. Technical papers are often available as ASCII files of clean, correct text. Other documents may be available only as hard copies, which have to be scanned and stored as images before a computer can process them. The textual content of these documents may also be extracted and recognized using OCR methods. Our survey covers the broad spectrum of methods required to handle different formats such as text and images. The core of the paper focuses on methods that manipulate document images directly and perform information processing tasks such as retrieval, categorization, and summarization without attempting to completely recognize the textual content of the document. We start, however, with a brief overview of traditional IR techniques that operate on clean text. We also discuss research dealing with text generated by running OCR on document images. Finally, we briefly touch on the related problem of content-based image retrieval.