E-Mail Signature Block Analysis

  • Venue: ICPR '98 Proceedings of the 14th International Conference on Pattern Recognition - Volume 2
  • Year: 1998

Abstract

The signature block is a common structured component of e-mail messages. Accurate identification and analysis of signature blocks is important in many multimedia messaging and information retrieval applications, such as e-mail text-to-speech rendering. It is also a very challenging task, because signature blocks often appear in complex two-dimensional layouts that are guided only by loose conventions. Traditional text analysis methods designed for sequential text cannot handle two-dimensional structures, while the highly unconstrained nature of signature blocks makes two-dimensional grammars very difficult to apply. In this paper we describe an algorithm for signature block analysis that combines two-dimensional structural segmentation with one-dimensional grammatical constraints. The information obtained from geometrical and linguistic analysis is integrated in the form of weighted finite-state transducers (WFSTs), and the final solution is the optimal interpretation under both constraints.
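
The idea of choosing "the optimal interpretation under both constraints" can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the layout has already been segmented into an ordered list of text segments, and it replaces the WFST machinery with a plain shortest-path search over summed costs (the min-plus view commonly used for weighted transducers). The labels, the cost values, and the names `best_labeling`, `EMISSIONS`, and `TRANSITIONS` are all hypothetical.

```python
import math

def best_labeling(emission_costs, transition_costs):
    """Return (total_cost, labels) for the cheapest label sequence.

    emission_costs: one dict per segment, mapping label -> cost. Assumed to
        summarize per-segment evidence (geometric and lexical cues).
    transition_costs: dict mapping (prev_label, label) -> cost, standing in
        for a one-dimensional grammar. Missing pairs are disallowed.
    """
    # best[label] = (cost of the cheapest path ending in label, that path)
    best = {lab: (cost, [lab]) for lab, cost in emission_costs[0].items()}
    for segment in emission_costs[1:]:
        nxt = {}
        for lab, emit in segment.items():
            candidates = [
                (cost + transition_costs.get((prev, lab), math.inf) + emit,
                 path + [lab])
                for prev, (cost, path) in best.items()
            ]
            nxt[lab] = min(candidates, key=lambda c: c[0])
        best = nxt
    return min(best.values(), key=lambda c: c[0])

# Hypothetical costs for three segments such as a name line, a phone
# number, and an e-mail address. Lower cost means a better fit.
EMISSIONS = [
    {"NAME": 0.2, "PHONE": 3.0, "EMAIL": 3.0},
    {"NAME": 2.5, "PHONE": 0.3, "EMAIL": 3.0},
    {"NAME": 2.0, "PHONE": 3.0, "EMAIL": 0.1},
]
# Hypothetical grammar: which label may follow which, and at what cost.
TRANSITIONS = {
    ("NAME", "NAME"): 1.0,
    ("NAME", "PHONE"): 0.1,
    ("NAME", "EMAIL"): 0.5,
    ("PHONE", "EMAIL"): 0.1,
    ("EMAIL", "PHONE"): 0.4,
}

cost, labels = best_labeling(EMISSIONS, TRANSITIONS)
print(labels, round(cost, 3))
```

In this toy setting the per-segment costs and the transition costs are simply added along each candidate path, so the selected labeling is the one that best satisfies both sources of evidence at once. A transducer-based formulation would express the same combination as composition followed by a shortest-path search.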