Using bags of symbols for automatic indexing of graphical document image databases

Authors:
Eugen Barbu;Pierre Héroux;Sébastien Adam;Éric Trupin
Affiliations:
LITIS, Université de Rouen, Saint-Etienne du Rouvray, France;LITIS, Université de Rouen, Saint-Etienne du Rouvray, France;LITIS, Université de Rouen, Saint-Etienne du Rouvray, France;LITIS, Université de Rouen, Saint-Etienne du Rouvray, France
Venue:
GREC'05 Proceedings of the 6th international conference on Graphics Recognition: ten Years Review and Future Perspectives
Year:
2005

Abstract

A database is only useful if it is associated with a set of procedures that retrieve the elements relevant to users' needs. Many information retrieval (IR) techniques have been developed for automatic indexing and retrieval in document databases. Most of these rely on indexes built from the textual content of documents, and very few can handle graphical or image content without human annotation. This paper describes an approach, similar to the bag-of-words technique, for automatically indexing graphical document image databases, along with different ways to query those databases. The approach proposes, in an unsupervised manner, a set of automatically discovered symbols that can be combined with logical operators to build queries.
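
The querying idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes symbol discovery has already happened and that each document is reduced to a bag of symbol identifiers. The document names, the symbol identifiers ("s1", "s2", "s3"), and the helper functions `build_index`, `query_and`, `query_or`, and `query_not` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def build_index(bags):
    """Build an inverted index: symbol id -> set of documents containing it."""
    index = defaultdict(set)
    for doc_id, symbols in bags.items():
        for symbol in symbols:
            index[symbol].add(doc_id)
    return index

def query_and(index, symbols):
    """Documents that contain every one of the given symbols."""
    sets = [index.get(s, set()) for s in symbols]
    return set.intersection(*sets) if sets else set()

def query_or(index, symbols):
    """Documents that contain at least one of the given symbols."""
    result = set()
    for s in symbols:
        result |= index.get(s, set())
    return result

def query_not(index, all_docs, symbol):
    """Documents that do not contain the given symbol."""
    return set(all_docs) - index.get(symbol, set())

# Hypothetical bags of symbols; a repeated id means the symbol occurs twice.
bags = {
    "plan_a": ["s1", "s2", "s2"],
    "plan_b": ["s2", "s3"],
    "plan_c": ["s1", "s3"],
}
index = build_index(bags)

both = query_and(index, ["s1", "s2"])      # s1 AND s2
either = query_or(index, ["s1", "s2"])     # s1 OR s2
# s3 AND NOT s1
s3_without_s1 = query_and(index, ["s3"]) & query_not(index, bags, "s1")
```

Because each operator returns a plain set of document identifiers, compound queries can be formed by combining results with ordinary set operations, as in the last line of the sketch.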