Object-level document analysis of PDF files
Proceedings of the 9th ACM symposium on Document engineering
This paper presents an object-based method for analysing the content drawn by graphical operators in natively digital PDF documents. We propose that graphical content in a document can be classified as either structural or non-structural, and we present an output model for the result of this analysis. Heuristic techniques are used to group the drawing instructions into regions and to determine the logical role of each region in the document's structure. Experimental results demonstrate the effectiveness of the algorithm.
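
The abstract does not specify the heuristics, so the following is only a minimal sketch of the general idea, not the method of the paper. It assumes that each graphical instruction has already been reduced to a bounding box, groups boxes that touch or lie within a small gap of each other into regions, and labels a region structural when it consists only of thin, ruling-line-like boxes. The names `Box`, `group_regions`, `classify`, and the thresholds `gap` and `max_thickness` are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Box:
    """Bounding box of one drawn graphic (hypothetical input representation)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def near(self, other: "Box", gap: float) -> bool:
        # True if the two boxes overlap or lie within `gap` units on both axes.
        return (self.x0 - gap <= other.x1 and other.x0 - gap <= self.x1
                and self.y0 - gap <= other.y1 and other.y0 - gap <= self.y1)

    def is_thin(self, max_thickness: float) -> bool:
        # A thin box looks like a ruling line rather than a filled shape.
        return min(self.x1 - self.x0, self.y1 - self.y0) <= max_thickness


def group_regions(boxes: List[Box], gap: float = 2.0) -> List[List[Box]]:
    """Group boxes into regions by transitive proximity (union-find)."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if boxes[i].near(boxes[j], gap):
                parent[find(i)] = find(j)

    groups: Dict[int, List[Box]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return list(groups.values())


def classify(region: List[Box], max_thickness: float = 1.5) -> str:
    """Assumed rule: a region made only of thin lines is structural."""
    if all(box.is_thin(max_thickness) for box in region):
        return "structural"
    return "non-structural"


# Two crossing ruling lines and one distant filled shape.
boxes = [
    Box(0, 0, 100, 1),        # horizontal rule
    Box(0, 0, 1, 50),         # vertical rule
    Box(200, 200, 260, 240),  # filled shape, e.g. part of an illustration
]
regions = group_regions(boxes)
labels = [classify(region) for region in regions]
```

In this sketch the two rules fall into one region and the filled shape into another. A real system would work on the path, stroke, and fill operators of the PDF content stream and would likely use richer cues than line thickness alone.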