User-driven development of text mining resources for cancer risk assessment

  • Authors:
  • Lin Sun; Anna Korhonen; Ilona Silins; Ulla Stenius

  • Affiliations:
  • University of Cambridge, Cambridge, UK; University of Cambridge, Cambridge, UK; Institute of Environmental Medicine, Stockholm, Sweden; Institute of Environmental Medicine, Stockholm, Sweden

  • Venue:
  • BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
  • Year:
  • 2009

Abstract

One of the most neglected areas of biomedical Text Mining (TM) is the development of systems based on carefully assessed user needs. We investigate the needs of an important task yet to be tackled by TM, Cancer Risk Assessment (CRA), and take the first step towards the development of TM for the task: identifying and organizing the scientific evidence required for CRA in a taxonomy. The taxonomy is based on expert annotation of 1297 MEDLINE abstracts. We report promising results with inter-annotator agreement tests and automatic classification experiments, and a user test which demonstrates that the resources we have built are well-defined, accurate, and applicable to a real-world CRA scenario. We discuss extending and refining the taxonomy further via manual and machine learning approaches, and the subsequent steps required to develop TM for the needs of CRA.