Multimodal alignment of scholarly documents and their presentations

  • Authors: Bamdad Bahrani; Min-Yen Kan

  • Affiliations: National University of Singapore, Singapore, Singapore; National University of Singapore, Singapore, Singapore

  • Venue: Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries
  • Year: 2013

Abstract

We present a multimodal system for aligning scholarly documents to corresponding presentations in a fine-grained manner (i.e., per presentation slide and per paper section). Our method improves upon a state-of-the-art baseline that employs only textual similarity. Based on an analysis of baseline errors, we propose a three-pronged alignment system that combines textual, image, and ordering information to establish alignment. Our results show a statistically significant improvement of 25%, confirming the importance of visual content in improving alignment accuracy.
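
The idea of combining textual, image, and ordering evidence can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, weights, or combination rule, so everything below is an assumption made for illustration. Jaccard token overlap stands in for the textual similarity, the image similarity matrix `image_sim` is assumed to be precomputed elsewhere, the ordering term simply rewards slides and sections that sit at similar relative positions, and the weights `w_text`, `w_image`, and `w_order` are hypothetical.

```python
from typing import List, Optional


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for a real textual similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def align_slides(
    slides: List[str],
    sections: List[str],
    image_sim: Optional[List[List[float]]] = None,
    w_text: float = 0.6,   # hypothetical weight
    w_image: float = 0.3,  # hypothetical weight
    w_order: float = 0.1,  # hypothetical weight
) -> List[int]:
    """Return, for each slide, the index of the best-scoring paper section."""
    n_slides, n_sections = len(slides), len(sections)
    alignment = []
    for i, slide in enumerate(slides):
        # Relative position of the slide within the deck, in [0, 1].
        slide_pos = i / max(n_slides - 1, 1)
        best_j, best_score = 0, float("-inf")
        for j, section in enumerate(sections):
            section_pos = j / max(n_sections - 1, 1)
            # Ordering evidence: presentations tend to follow the paper's order.
            order_score = 1.0 - abs(slide_pos - section_pos)
            # Image evidence: assumed precomputed; zero if unavailable.
            image_score = image_sim[i][j] if image_sim is not None else 0.0
            score = (
                w_text * jaccard(slide, section)
                + w_image * image_score
                + w_order * order_score
            )
            if score > best_score:
                best_j, best_score = j, score
        alignment.append(best_j)
    return alignment


if __name__ == "__main__":
    slides = [
        "we introduce the alignment problem",
        "our method combines text and image features",
        "results show improvement",
    ]
    sections = [
        "introduction to the alignment problem",
        "method text image features combined",
        "results and improvement discussion",
    ]
    print(align_slides(slides, sections))
```

A weighted sum is only one simple way to fuse the three signals; a trained classifier or a sequence model over slide order would be equally plausible readings of "three-pronged".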