An efficient pattern matching algorithm for comparative Genome sequence analysis

  • Authors:
  • Muneer Ahmad; Hassan Mathkour

  • Affiliations:
  • Department of Computer Science, College of Computer & Information Sciences, King Saud University, Saudi Arabia; Department of Computer Science, College of Computer & Information Sciences, King Saud University, Saudi Arabia

  • Venue:
  • ACC'08 Proceedings of the WSEAS International Conference on Applied Computing Conference
  • Year:
  • 2008

Abstract

Sequences, understood as logical units of meaningful term successions, can be considered the backbone of data. Examples include genetic sequences, whose terms are genetic symbols, and plain natural-language sentences, formed by words. To name just a few uses, sentences describe the real world modeled in a database and make up documents. Searching sequence repositories often requires going beyond exact matching to find the sequences that are similar or close to a given query sequence (approximate matching). The similarity involved in this process can be based either on the semantics of the sequence or just on its syntax. The former considers the meaning of the terms in the sequences, and it is almost impossible to produce results before proper extraction and analysis, while the latter approach is sufficiently comprehensive at the implementation level. It finds the number of approximate matches between sequences for optimal results.
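
To make the idea of syntactic approximate matching concrete, the following is a minimal sketch that counts the windows of a sequence differing from a query pattern in at most a given number of positions. It is an illustration only and is not the algorithm proposed in the paper, which the abstract does not specify. The function name `count_approximate_matches`, the parameter `max_mismatches`, the mismatch-count (Hamming) similarity measure, and the sample sequence are all assumptions made for the example.

```python
def count_approximate_matches(text: str, pattern: str, max_mismatches: int) -> int:
    """Count windows of `text` that differ from `pattern` in at most
    `max_mismatches` positions (a purely syntactic, Hamming-style measure).

    Hypothetical helper for illustration; not the paper's method.
    """
    m = len(pattern)
    if m == 0 or m > len(text):
        return 0

    count = 0
    # Slide a window of the pattern's length across the text.
    for start in range(len(text) - m + 1):
        mismatches = 0
        for a, b in zip(text[start:start + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many differences, abandon this window
        else:
            # The inner loop finished without breaking: window is close enough.
            count += 1
    return count


if __name__ == "__main__":
    genome = "ACGTACGAACGT"  # made-up sample sequence
    print(count_approximate_matches(genome, "ACGT", 0))  # exact matches only
    print(count_approximate_matches(genome, "ACGT", 1))  # allow one mismatch
```

With `max_mismatches` set to 0 the function reduces to exact matching; raising it admits progressively more distant windows. This brute-force scan takes time proportional to the product of the two lengths, so it is useful only as a reference point when comparing against more efficient matching schemes.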