Mapping equivalence for symbolic sequences: theory and applications

  • Authors:
  • Liming Wang;Dan Schonfeld

  • Affiliations:
  • Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL;Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL

  • Venue:
  • IEEE Transactions on Signal Processing
  • Year:
  • 2009

Quantified Score

Hi-index 35.68

Visualization

Abstract

Processing of symbolic sequences represented by mapping of symbolic data into numerical signals is commonly used in various applications. It is a particularly popular approach in genomic and proteomic sequence analysis. Numerous mappings of symbolic sequences have been proposed for various applications. It is unclear however whether the processing of symbolic data provides an artifact of the numerical mapping or is an inherent property of the symbolic data. This issue has been long ignored in the engineering and scientific literature. It is possible that many of the results obtained in symbolic signal processing could be a byproduct of the mapping and might not shed any light on the underlying properties embedded in the data. Moreover, in many applications, conflicting conclusions may arise due to the choice of the mapping used for numerical representation of symbolic data. In this paper, we present a novel framework for the analysis of the equivalence of the mappings used for numerical representation of symbolic data. We present strong and weak equivalence properties and rely on signal correlation to characterize equivalent mappings. We derive theoretical results which establish conditions for consistency among numerical mappings of symbolic data. Furthermore, we introduce an abstract mapping model for symbolic sequences and extend the notion of equivalence to an algebraic framework. Finally, we illustrate our theoretical results by application to DNA sequence analysis.