Can back-of-the-book indexes be automatically created?

  • Authors:
  • Zhaohui Wu;Zhenhui Li;Prasenjit Mitra;C. Lee Giles

  • Affiliations:
  • Computer Science and Engineering, Pennsylvania State University, State College, PA, USA;Information Sciences and Technology, Pennsylvania State University, State College, USA;Information Sciences and Technology, Pennsylvania State University, State College, USA;Information Sciences and Technology, Pennsylvania State University, State College, USA

  • Venue:
  • Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
  • Year:
  • 2013

Abstract

The creation of back-of-the-book indexes remains one of the few manual tasks related to publishing. Inspired by how human indexers create back-of-the-book indexes, we present a new domain-independent, corpus-free, and training-free automated approach. Given a book, index terms are selected sequentially according to an indexability score computed from the structural information in the book as well as a novel context-aware term informativeness measure that draws on web knowledge bases such as Wikipedia. Through extensive experiments on books from various domains, we show that our approach is more effective and practical than earlier approaches based on keyword extraction and supervised learning.