Enhancing semantic relation quality of UMLS knowledge sources

Authors:
Demeke Ayele;Jean-Pierre Chevallet;Getnet Kassie;Million Meshesha
Affiliations:
Addis Ababa University, Addis Ababa, Ethiopia;University of Grenoble, Grenoble, France;Addis Ababa University, Addis Ababa, Ethiopia;Addis Ababa University, Addis Ababa, Ethiopia
Venue:
Proceedings of the International Conference on Management of Emergent Digital EcoSystems
Year:
2012

Abstract

The quality of semantic tuples (semantic triples of the form subject-predicate-object) has a significant impact on most text mining and knowledge discovery applications. The practical success and usability of these applications depend heavily on the quality of the extracted semantic triples. Most biomedical semantic resources have been developed for different contexts, focusing on structural representation but paying less attention to the acceptability and naturalness of individual semantic triples. In this article, we present an integrated approach for enhancing the quality of semantic tuples in the UMLS knowledge sources. The approach integrates three existing auditing techniques: avoiding redundant classification of semantic concepts, reducing hierarchical relationship inconsistencies, and reducing associative relationship inconsistencies. We evaluated the approach by the number of wrongly assigned concepts and inconsistent relationships it identified, and we assessed the quality of individual semantic triples by their acceptability and naturalness. The evaluation shows promising results. We randomly extracted 10,082 semantic triples from UMLS, of which 5,646 are taxonomically related and 4,436 are non-taxonomically related. 826 concepts were found to be redundantly classified and 352 were found to be hierarchically inconsistent. Of the 4,436 non-taxonomic semantic triples, 726 were found to be inconsistent. Domain experts also evaluated the quality (acceptability and naturalness) of the first 100 semantic triples. Agreement between the annotators, measured with Cohen's kappa coefficient, was 0.8.
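For readers unfamiliar with the agreement statistic named in the abstract, the following is a minimal sketch of how Cohen's kappa can be computed for two annotators. The function name `cohen_kappa`, the label names, and the ten toy judgments are assumptions made for illustration; they are not the study's annotations, and the abstract does not describe the authors' tooling.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    own label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")

    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the two annotators' marginal label rates,
    # summed over every label either annotator used.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)

    if p_e == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, and this sketch reports perfect agreement.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy judgments for ten triples (not data from the paper).
annotator_1 = ["acceptable"] * 5 + ["unacceptable"] * 5
annotator_2 = ["acceptable"] * 4 + ["unacceptable"] * 6

toy_kappa = cohen_kappa(annotator_1, annotator_2)
print(round(toy_kappa, 3))
```

In this toy case the annotators agree on nine of ten items and chance agreement is 0.5, so kappa corrects the raw 0.9 agreement downward. The same correction is why a kappa value is more informative than a plain percentage of matching judgments.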