Semantic text mining for lignocellulose research

  • Authors:
  • Marie-Jean Meurs;Caitlin Murphy;Ingo Morgenstern;Nona Naderi;Greg Butler;Justin Powlowski;Adrian Tsang;René Witte

  • Affiliations:
  • Concordia University, Montréal, PQ, Canada (all authors)

  • Venue:
  • Proceedings of the ACM fifth international workshop on Data and text mining in biomedical informatics
  • Year:
  • 2011

Abstract

Semantic technologies, including natural language processing (NLP), ontologies, semantic web services and web-based collaboration tools, promise to support users in dealing with complex data, thereby facilitating knowledge-intensive tasks. An ongoing challenge is to select the appropriate technologies and combine them into a coherent system that brings measurable improvements to its users. We present our ongoing development of a semantic infrastructure in support of genomics-based lignocellulose research. Part of this effort is the automated curation of knowledge about fungal enzymes from the literature and genome resources. Fungi naturally break down lignocellulose, hence the identification and characterization of the enzymes that they use in lignocellulose hydrolysis is an important part of the research and development of biomass-derived products and fuels. Working closely with the biology researchers who manually curate the existing literature, we developed ontological NLP pipelines integrated in a Web-based interface to help them in two main tasks: mining the literature for relevant information, and at the same time providing rich and semantically linked information.