This paper describes a morphological tagging approach for analyzing invoice document images. Tokens that are morphologically close and that occur at consistent positions across several similar contexts reveal parts of speech that represent the structural elements. This bottom-up approach requires no a priori knowledge, provided that the text contains redundant and frequent contexts. It is applied to the invoice body text, which has been roughly recognized by OCR and automatically segmented. The method detects the invoice articles and their individual fields, exploiting the regular composition of articles and their repetition within an invoice to recover the structure. On 276 invoices containing 1704 articles, the recognition rate is above 91.02% for articles and 92.56% for fields.
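The core idea, tagging tokens by their morphology and then looking for tag sequences that repeat across lines, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tag set (`INT`, `DEC`, `ALPHA`, `CODE`, `OTHER`), the function names, the run-collapsing step, and the `min_support` threshold are all assumptions chosen for illustration, and the sample lines are invented.

```python
import re
from collections import Counter


def morph_tag(token):
    """Map a token to a coarse morphological class (hypothetical tag set)."""
    if re.fullmatch(r"\d+", token):
        return "INT"
    if re.fullmatch(r"\d+[.,]\d+", token):
        return "DEC"
    if re.fullmatch(r"[A-Za-z]+", token):
        return "ALPHA"
    if re.fullmatch(r"[A-Za-z0-9-]+", token):
        return "CODE"
    return "OTHER"


def line_signature(line):
    """Tag each token, then collapse runs of identical tags.

    Collapsing lets descriptions of different word counts share a signature
    (an assumption of this sketch, not a step stated in the abstract).
    """
    signature = []
    for token in line.split():
        tag = morph_tag(token)
        if not signature or signature[-1] != tag:
            signature.append(tag)
    return tuple(signature)


def find_article_lines(lines, min_support=2):
    """Return the most frequent tag pattern and the lines that match it.

    A pattern counts as an article structure only if it is redundant,
    i.e. it occurs on at least `min_support` lines.
    """
    signatures = [line_signature(line) for line in lines]
    counts = Counter(sig for sig in signatures if sig)
    if not counts:
        return None, []
    pattern, support = counts.most_common(1)[0]
    if support < min_support:
        return None, []
    matches = [line for line, sig in zip(lines, signatures) if sig == pattern]
    return pattern, matches


if __name__ == "__main__":
    # Invented OCR-like invoice body lines, for illustration only.
    body = [
        "A123 blue widget 2 4.50 9.00",
        "B77 steel bolt large 10 0.30 3.00",
        "Total 12.00",
        "C9 cable 1 2,50 2,50",
    ]
    pattern, articles = find_article_lines(body)
    print(pattern)
    for article in articles:
        print(article)
```

No layout template or vocabulary is supplied here: the article pattern emerges only because the same tag sequence recurs on several lines, which mirrors the abstract's condition that redundant and frequent contexts be present. A real system would also need to tolerate OCR errors and multi-line articles, which this sketch does not attempt.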