Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
Journal of Biomedical Informatics
Developing a robust part-of-speech tagger for biomedical text
PCI'05 Proceedings of the 10th Panhellenic conference on Advances in Informatics
Hi-index | 0.00 |
As more health data becomes available, information extraction aims to make an impact on the workflows of hospitals and care centers. One of the targeted areas is the management of pathology reports, which are employed for cancer diagnosis and staging. In this work we integrate text mining tools in the workflow of the Royal Melbourne Hospital, to extract information from pathology reports with minimal expert intervention. Our framework relies on coarse-grained annotation (at document level), making it highly portable. Our evaluation shows that the kind of language used in these reports makes it feasible to extract information with high precision and recall, by means of state-of-the-art classification methods, and feature engineering.