Computing Knowledge-Based Semantic Similarity from the Web: An Application to the Biomedical Domain

Authors:
David Sánchez; Montserrat Batet; Aida Valls
Affiliations:
Department of Computer Science and Mathematics, University Rovira i Virgili, Tarragona 43007 (all three authors)
Venue:
KSEM '09 Proceedings of the 3rd International Conference on Knowledge Science, Engineering and Management
Year:
2009

Citing 13

Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL
ECML '01 Proceedings of the 12th European Conference on Machine Learning

An Information-Theoretic Definition of Similarity
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning

Verbs semantics and lexical selection
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics

Towards the development of a conceptual distance metric for the UMLS
Journal of Biomedical Informatics

A semantic concordance
HLT '93 Proceedings of the workshop on Human Language Technology

Evaluating WordNet-based Measures of Lexical Semantic Relatedness
Computational Linguistics

Ontology Learning and Population from Text: Algorithms, Evaluation and Applications

Measures of semantic similarity and relatedness in the biomedical domain
Journal of Biomedical Informatics

The Google Similarity Distance
IEEE Transactions on Knowledge and Data Engineering

Unsupervised Semantic Similarity Computation using Web Search Engines
WI '07 Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence

Learning non-taxonomic relationships from web documents for domain ontology construction
Data & Knowledge Engineering

Using information content to evaluate semantic similarity in a taxonomy
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1

Processing natural language without natural language processing
CICLing'03 Proceedings of the 4th international conference on Computational linguistics and intelligent text processing

Cited 2

Ontology-driven web-based semantic similarity
Journal of Intelligent Information Systems

An evaluation of corpus-driven measures of medical concept similarity for information retrieval
Proceedings of the 21st ACM international conference on Information and knowledge management

Abstract

Computing semantic similarity between concepts is a common problem in many language-related tasks and knowledge domains. In the biomedical field, several approaches address it by exploiting the knowledge available in domain ontologies (e.g., SNOMED-CT) and in specific, closed, and reliable corpora (e.g., clinical data). In recent years, however, the enormous growth of the Web has motivated researchers to use it as the base corpus for semantic analysis of language. This paper proposes and evaluates the use of the Web as a background corpus for measuring the similarity of biomedical concepts. Several classical similarity measures were tested on a benchmark composed of biomedical terms, and the results were compared against approaches that used specific clinical data. The results show that the similarity values obtained from the Web are even more reliable than those obtained from specific clinical data, which supports the suitability of the Web as an information corpus for the biomedical domain.
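
The abstract does not name the measures it tested. As a rough illustration only, the sketch below assumes they include information-content (IC) measures of the Resnik and Lin kind, with concept probabilities estimated from Web hit counts. The function names, the hit counts, and the index size are hypothetical values chosen for the example, not figures from the paper, and no real search-engine API is called.

```python
import math


def information_content(hits: int, total_pages: int) -> float:
    """IC(c) = -log2 p(c), with p(c) estimated as hits / total_pages.

    Assumption: `hits` is the page count a search engine reports for the
    term and `total_pages` is an estimate of the size of its index.
    """
    if total_pages <= 0 or not 0 < hits <= total_pages:
        raise ValueError("need 0 < hits <= total_pages")
    return -math.log2(hits / total_pages)


def resnik_similarity(ic_subsumer: float) -> float:
    """Resnik-style similarity: the IC of the most specific common subsumer."""
    return ic_subsumer


def lin_similarity(ic_a: float, ic_b: float, ic_subsumer: float) -> float:
    """Lin-style similarity: shared IC normalised by the ICs of both concepts."""
    denominator = ic_a + ic_b
    return 0.0 if denominator == 0 else 2.0 * ic_subsumer / denominator


# Hypothetical hit counts (powers of two keep the arithmetic easy to follow).
TOTAL_PAGES = 2 ** 20
ic_term_a = information_content(2 ** 10, TOTAL_PAGES)    # rarer term, higher IC
ic_term_b = information_content(2 ** 12, TOTAL_PAGES)
ic_subsumer = information_content(2 ** 16, TOTAL_PAGES)  # common ancestor in the ontology

print(resnik_similarity(ic_subsumer))
print(lin_similarity(ic_term_a, ic_term_b, ic_subsumer))
```

In this sketch the ontology (for example SNOMED-CT) would only supply the common subsumer of the two terms, while the Web supplies the frequencies. Swapping the Web counts for counts from a clinical corpus would leave the formulas unchanged.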