Semantic annotation of biomedical literature using Google

Authors:
Rune Sætre;Amund Tveit;Tonje S. Steigedal;Astrid Lægreid
Affiliations:
Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway;Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway;Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway;Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
Venue:
ICCSA'05 Proceedings of the 2005 international conference on Computational Science and Its Applications - Volume Part III
Year:
2005

Abstract

With the growing volume of biomedical literature, researchers need automatic information extraction to support their work. Because biomedical information databases are incomplete, extraction based on dictionaries alone is not straightforward, and several approaches using contextual rules and machine learning have previously been proposed. Our work is inspired by these approaches, but is novel in that it uses Google for semantic annotation of biomedical words. The semantic annotation accuracy obtained, 52% on words not found in the Brown Corpus, Swiss-Prot or LocusLink (accessed using Gsearch.org), justifies further work in this direction.
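
The abstract does not say how the search engine is queried, so the following is only a minimal sketch of one way the idea could work, not the authors' method. It assumes that a word missing from the available dictionaries is assigned the semantic class whose query phrase returns the most search hits. The query templates, the function names `annotate` and `accuracy`, and the `hit_count` callable (a stand-in for whatever search interface is used) are all hypothetical.

```python
from typing import Callable, Dict, Optional, Set

# Hypothetical query templates: the abstract does not list the actual queries.
TEMPLATES: Dict[str, str] = {
    "protein": '"{word} is a protein"',
    "gene": '"{word} is a gene"',
    "disease": '"{word} is a disease"',
}


def annotate(
    word: str,
    known_words: Set[str],
    hit_count: Callable[[str], int],
    templates: Dict[str, str] = TEMPLATES,
) -> Optional[str]:
    """Guess a semantic class for a word that no dictionary covers.

    `known_words` stands in for the dictionaries named in the abstract
    (Brown Corpus, Swiss-Prot, LocusLink); `hit_count` is an assumed
    callable that returns the number of search hits for a query string.
    """
    if word.lower() in known_words:
        return None  # covered by a dictionary, so no web lookup is needed
    # Ask the search backend how often each candidate phrase occurs.
    counts = {
        label: hit_count(template.format(word=word))
        for label, template in templates.items()
    }
    best = max(counts, key=counts.get)
    # With no hits for any phrase, make no guess at all.
    return best if counts[best] > 0 else None


def accuracy(predicted: Dict[str, Optional[str]], gold: Dict[str, str]) -> float:
    """Fraction of gold-annotated words whose predicted class matches."""
    if not gold:
        return 0.0
    correct = sum(1 for word, label in gold.items() if predicted.get(word) == label)
    return correct / len(gold)
```

Passing `hit_count` in as a parameter keeps the sketch independent of any particular search API and lets it be exercised with a stub. A figure such as the 52% reported above would correspond to `accuracy` computed over the words that none of the dictionaries cover.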