Converting a PDF document to an HTML document that preserves the same layout is an important and interesting research problem. After conversion, the document can be browsed online easily and its information can be extracted. Building on the extraction output of the open-source tool PDFBox, this paper describes a method that detects the layout information of a PDF document and converts the document to an HTML page effectively.
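
To make the idea of a layout-preserving conversion concrete, here is a minimal Python sketch. It is not the method described in the paper, and it does not call PDFBox: the `Fragment` class, the `group_into_lines` and `to_html` functions, the 2-point line tolerance, and the 612 x 792 pt page size are all assumptions introduced for illustration. It assumes an extractor has already produced text fragments with page coordinates, groups them into lines by vertical position, and emits absolutely positioned HTML.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Fragment:
    """A positioned text run; a hypothetical stand-in for extractor output."""
    text: str
    x: float     # left edge in points
    y: float     # top edge in points, measured from the top of the page
    size: float  # font size in points


def group_into_lines(fragments, tolerance=2.0):
    """Group fragments whose top edges lie within `tolerance` points of the
    first fragment of the current line, then order each line left to right."""
    lines = []
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(lines[-1][0].y - frag.y) <= tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    for line in lines:
        line.sort(key=lambda f: f.x)
    return lines


def to_html(fragments, width=612, height=792):
    """Emit one absolutely positioned <span> per fragment inside a page-sized
    container, so the on-screen positions mirror the page coordinates."""
    parts = [f'<div style="position:relative;width:{width}pt;height:{height}pt">']
    for line in group_into_lines(fragments):
        for frag in line:
            parts.append(
                f'<span style="position:absolute;left:{frag.x}pt;top:{frag.y}pt;'
                f'font-size:{frag.size}pt">{escape(frag.text)}</span>'
            )
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    sample = [
        Fragment("World", 60.0, 101.0, 12.0),
        Fragment("Hello", 10.0, 100.0, 12.0),
        Fragment("A & B", 10.0, 130.0, 10.0),
    ]
    print(to_html(sample))
```

Absolute positioning is the simplest way to keep the original layout, at the cost of HTML that does not reflow. A fuller converter would likely also detect columns, paragraphs, and tables before choosing the markup.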