This article presents an unsupervised algorithm for semantic annotation of morphological descriptions of whole organisms. The algorithm annotates plain-text descriptions with high accuracy at the clause level by exploiting the corpus itself; it does not need lexicons, syntactic parsers, training examples, or annotation templates. An evaluation on two real-life description collections in botany and paleontology shows that the algorithm has the following desirable features: (a) it reduces or eliminates the manual labor required to compile dictionaries and prepare source documents; (b) it improves annotation coverage, because it annotates what appears in the documents and is not limited by predefined and often incomplete templates; (c) it learns clean and reusable concepts, namely organ names and character states that can be used to construct reusable domain lexicons, as opposed to collection-dependent patterns whose applicability is often limited to a particular collection; (d) it is insensitive to collection size; and (e) it runs in linear time with respect to the number of clauses to be annotated. © 2010 Wiley Periodicals, Inc.