Computing the semantic similarity between concepts is a common problem in many language-related tasks and knowledge domains. In the biomedical field, several approaches have been developed to address it by exploiting the knowledge available in domain ontologies (e.g., SNOMED-CT) and in specific, closed and reliable corpora (e.g., clinical data). In recent years, however, the enormous growth of the Web has motivated researchers to use it as the base corpus for the semantic analysis of language. This paper proposes and evaluates the use of the Web as a background corpus for measuring the similarity of biomedical concepts. Several classical similarity measures were considered and tested using a benchmark composed of biomedical terms, and the results were compared against approaches that used specific clinical data. The results show that the similarity values obtained from the Web are even more reliable than those obtained from specific clinical data, which supports the suitability of the Web as an information corpus for the biomedical domain.
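As an illustration of how the Web can act as a background corpus, the sketch below estimates pointwise mutual information (PMI) from page counts. The abstract does not say which measures were tested, so the choice of PMI is an assumption made here for illustration. The function name `pmi_from_hits`, the corpus size `TOTAL_PAGES`, and all counts are hypothetical values, not results from the paper or from any search engine.

```python
import math


def pmi_from_hits(hits_a: int, hits_b: int, hits_ab: int, total_pages: int) -> float:
    """Estimate pointwise mutual information from page counts.

    hits_a, hits_b: pages containing each term alone (hypothetical counts).
    hits_ab: pages containing both terms.
    total_pages: assumed size of the indexed corpus.
    """
    if min(hits_a, hits_b, hits_ab) <= 0:
        # No co-occurrence evidence: treat the pair as maximally dissimilar.
        return float("-inf")
    p_a = hits_a / total_pages
    p_b = hits_b / total_pages
    p_ab = hits_ab / total_pages
    # PMI compares the observed joint probability with the one expected
    # if the two terms occurred independently of each other.
    return math.log2(p_ab / (p_a * p_b))


# Made-up counts, chosen only to show the behaviour of the measure.
TOTAL_PAGES = 1_000_000
frequent_pair = pmi_from_hits(1_000, 2_000, 500, TOTAL_PAGES)
chance_pair = pmi_from_hits(1_000, 2_000, 2, TOTAL_PAGES)

print(f"frequently co-occurring pair: {frequent_pair:.2f}")
print(f"chance-level pair: {chance_pair:.2f}")
```

A pair of terms that co-occurs far more often than chance receives a high score, while a pair that co-occurs at the rate expected under independence scores close to zero. In practice the counts would come from a search engine, whose reported totals are only approximate.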