As the volume of biomedical literature grows, researchers need automatic information extraction to keep up with it. GeneTUC was developed to read biological texts and then answer questions about them. The system builds its knowledge base by parsing MEDLINE abstracts or other online text retrieved through the Google API. When it encounters a word that is not in its dictionary, it can use the Google API to determine the word's semantic class automatically and add the word to the dictionary. The GeneTUC parser was evaluated against the manually tagged GENIA corpus using EvalB, yielding bracketing precision and recall of 70.6% and 53.9%, respectively. GeneTUC was able to parse 60.2% of the sentences, and its POS-tagging accuracy was 86.0%. These figures are lower than those of the best available taggers and parsers, but GeneTUC also supports deeper reasoning, such as anaphora resolution and question answering, which state-of-the-art parsers do not provide.
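For readers unfamiliar with bracketing scores, the following is a minimal sketch of how labeled bracketing precision and recall can be computed for one sentence. It is an illustration only, not the EvalB implementation: the real tool applies additional normalization that this sketch omits. The function name `bracket_scores`, the `(label, start, end)` span representation, and the toy brackets are assumptions introduced here, not data from the GeneTUC evaluation.

```python
from collections import Counter


def bracket_scores(gold, predicted):
    """Return (precision, recall) for labeled brackets.

    Each bracket is a (label, start, end) tuple; this representation is an
    assumption made for the sketch.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: a predicted bracket counts as matched at most
    # as many times as the same bracket occurs in the gold annotation.
    matched = sum((gold_counts & pred_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy sentence, not taken from the GENIA corpus.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
predicted = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5), ("NP", 4, 5)]

precision, recall = bracket_scores(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the five predicted brackets match the gold annotation, so precision is 3/5 and recall is 3/4. A parser that proposes few but mostly correct brackets shows the same pattern as the reported scores: precision higher than recall.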