Assisted editing in the biomedical domain: motivation and challenges.

  • Authors: Fabio Rinaldi
  • Affiliations: University of Zurich, Zurich, Switzerland
  • Venue: Proceedings of the 2013 ACM symposium on Document engineering
  • Year: 2013

Abstract

One of the characteristics of biomedical scientific literature is the high ambiguity of the domain-specific terminology used to describe technical concepts and specific objects of the domain. This is partly due to the very broad scope of the domain of interest and partly to inherent properties of the terminology itself. There are simply very large numbers of genes, proteins, organs, cell lines, cellular phenomena, experimental methods, and so on. For example, UniProt, the most authoritative protein database, currently contains more than 33 million entries. Clearly, the names typically used to refer to proteins are polysemic, and a single name might refer to hundreds of different entries in a reference database. Such a large and extensive terminology necessarily makes it difficult to derive from the literature a simplified representation of the entities and relationships described in the articles, despite considerable efforts by the text mining community. In this paper we propose to complement such efforts with editing tools that can assist authors in efficiently adding a minimal semantic annotation to their publications, so that much of the ambiguity is avoided.