NLP-Based Curation of Bacterial Regulatory Networks

  • Authors:
  • Carlos Rodríguez-Penagos;Heladia Salgado;Irma Martínez-Flores;Julio Collado-Vides

  • Affiliations:
  • Programa de Genómica Computacional, Centro de Ciencias Genómicas. Universidad Nacional Autónoma de México, Apdo. Postal 565-A, Avenida Universidad, Cuernavaca, Morelos, 62100, ...;Programa de Genómica Computacional, Centro de Ciencias Genómicas. Universidad Nacional Autónoma de México, Apdo. Postal 565-A, Avenida Universidad, Cuernavaca, Morelos, 62100, ...;Programa de Genómica Computacional, Centro de Ciencias Genómicas. Universidad Nacional Autónoma de México, Apdo. Postal 565-A, Avenida Universidad, Cuernavaca, Morelos, 62100, ...;Programa de Genómica Computacional, Centro de Ciencias Genómicas. Universidad Nacional Autónoma de México, Apdo. Postal 565-A, Avenida Universidad, Cuernavaca, Morelos, 62100, ...

  • Venue:
  • CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2007

Abstract

Manual curation of biological databases is an expensive and labor-intensive process in Genomics and Systems Biology. We report the implementation of a state-of-the-art, rule-based Natural Language Processing system that creates computer-readable networks of regulatory interactions directly from abstracts and full-text papers. We evaluate its output against a manually curated standard database, and test the possibilities and limitations of automatic and semi-automatic curation of the so-called biobibliome. We also propose a novel Regulatory Interaction Mining Markup Language suited for representing this data, useful both for biologists and for text-mining specialists.