This paper reports the further development of machine learning techniques for semantic markup of biodiversity literature, especially morphological descriptions of living organisms such as those hosted at efloras.org and algaebase.org. Earlier research explored syntactic parsing and supervised machine learning techniques. The limitations of these techniques prompted our investigation of an unsupervised learning approach that combines the strengths of the earlier techniques while avoiding their limitations. Semantic markup at the organ and character levels is discussed. Research on semantic markup of natural heritage literature has a direct impact on the development of semantic-based access in biodiversity digital libraries.