The paper discusses problems in annotating a corpus of Polish clinical data with low-level linguistic information. We propose an approach to tokenization and automatic morphological annotation that combines existing programs with a set of domain-specific rules and vocabulary. Finally, we present the results of manual verification of the annotation for a subset of the data.