Modelling knowledge strategy for solving the DNA sequence annotation problem through CommonKADS methodology

  • Authors:
  • Daniela Xavier;Federico Morán;Rubén Fuentes-Fernández;Gonzalo Pajares

  • Affiliations:
  • Department of Biochemistry and Molecular Biology I, Universidad Complutense de Madrid, Avd. Complutense s/n, 28040 Madrid, Spain;Department of Biochemistry and Molecular Biology I, Universidad Complutense de Madrid, Avd. Complutense s/n, 28040 Madrid, Spain;Department of Software Engineering and Artificial Intelligence, Universidad Complutense de Madrid, C/Profesor José García Santesmases s/n, 28040 Madrid, Spain;Department of Software Engineering and Artificial Intelligence, Universidad Complutense de Madrid, C/Profesor José García Santesmases s/n, 28040 Madrid, Spain

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2013

Quantified Score

Hi-index 12.05

Abstract

Finding the genes within a DNA sequence and assigning them biological features and functions is one of the biggest challenges in genomics. This task, called annotation, has to be as accurate and reliable as possible, because the resulting information is used in subsequent research. Ideally, each sequence should be annotated and validated by a human expert, who has the knowledge to infer the most appropriate annotation. However, the huge amount of genomic data produced by new sequencing technologies makes this impractical. Expert systems that annotate sequences automatically and emulate the expert's involvement at key points of the process would improve annotation quality. In this work, the CommonKADS methodology is applied to this purpose in a novel way. It is used to structure and model the knowledge required to build an expert system that handles the functional part of sequence annotation, i.e. establishing the biological purpose of the sequence. This approach provides the first general framework for this problem, which can be easily extended to related issues.