Challenges in generating bookmarks from TOC entries in e-books

  • Authors:
  • Yogalakshmi Jayabal;Chandrashekar Ramanathan;Mehul Jayprakash Sheth

  • Affiliations:
  • International Institute of Information Technology, Bangalore, India;International Institute of Information Technology, Bangalore, India;International Institute of Information Technology, Bangalore, India

  • Venue:
  • Proceedings of the 2012 ACM symposium on Document engineering
  • Year:
  • 2012

Abstract

The task of extracting document structure from a digital e-book is difficult and is an active area of research. On the other hand, many e-books already have a table of contents (TOC) at the beginning of the document. This may lead us to believe that adding bookmarks to a digital document (e-book) based on the existing TOC would be trivial. In this paper, we highlight the challenges involved in automatically adding bookmarks to an existing e-book based on the TOC that exists within the document. If we can reliably identify the specific location of each TOC entry within the document, the algorithms can easily be extended to identify document structures within e-books that have a TOC. We describe a tool we have built, called Booky, that tries to automatically add PDF bookmarks to existing PDF-based e-books that have a TOC as part of the document content. The tool addresses most of the challenges that have been identified, while leaving a few tricky scenarios open.
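
To make the core step concrete, the following is a minimal sketch of locating TOC entries within the body of a document. It is not Booky's implementation, which the abstract does not describe. It assumes that the page texts have already been extracted as plain strings, that each TOC line has the shape "title, dot leaders, printed page number", and that entries appear in the body in TOC order. The names `parse_toc`, `locate_entries`, and `TOC_LINE` are hypothetical.

```python
import re

# Assumed TOC line shape: "<title> ..... <printed page number>".
# Real TOCs vary widely; this pattern is illustrative only.
TOC_LINE = re.compile(r"^(?P<title>.+?)[\s.]{2,}(?P<page>\d+)\s*$")


def parse_toc(toc_text):
    """Return (title, printed_page) pairs for lines that look like TOC entries."""
    entries = []
    for line in toc_text.splitlines():
        match = TOC_LINE.match(line.strip())
        if match:
            entries.append((match.group("title").strip(), int(match.group("page"))))
    return entries


def normalize(text):
    """Collapse whitespace and lowercase so minor layout differences do not block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def locate_entries(entries, pages, toc_page_index):
    """Map each TOC title to the index of the first later page containing it.

    The printed page number is not used for lookup, because it may not
    coincide with the physical page index (for example, when front matter
    is numbered separately). Entries that cannot be found map to None.
    """
    results = []
    start = toc_page_index + 1  # skip the TOC page itself
    for title, _printed_page in entries:
        needle = normalize(title)
        found = None
        for index in range(start, len(pages)):
            if needle in normalize(pages[index]):
                found = index
                break
        if found is not None:
            start = found  # assume entries appear in TOC order
        results.append((title, found))
    return results


# Usage with made-up page texts:
pages = [
    "Contents\n1 Introduction ..... 1\n2 Methods ..... 3",
    "1 Introduction\nSome text",
    "More text",
    "2   Methods\nSome text",
]
bookmarks = locate_entries(parse_toc(pages[0]), pages, toc_page_index=0)
```

Even this simple sketch hints at why the task is not trivial: a title that is wrapped across lines, reworded in the body, or repeated in running headers would be missed or matched on the wrong page.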