Estimating the degree of semantic similarity or distance between concepts is a common problem in research areas such as natural language processing, knowledge acquisition, information retrieval and data mining. Many similarity measures have been proposed that exploit either explicit knowledge, such as the structure of a taxonomy, or implicit knowledge, such as information distribution. In the former case, taxonomies and/or ontologies are used to introduce additional semantics; in the latter case, the frequencies of term appearances in a corpus are considered. Classical measures based on these premises suffer from several problems: in the first case, an excessive dependence on the taxonomical or ontological structure; in the second case, the lack of semantics of a purely statistical analysis of occurrences and/or the ambiguity of estimating the statistical distribution of concepts from term appearances. Measures based on the Information Content (IC) of taxonomical concepts combine both approaches. However, to compute accurate concept appearance probabilities, they depend heavily on a corpus that has been properly pre-tagged and disambiguated according to the ontological entities. This limits the applicability of those measures to other ontologies, such as domain-specific ontologies, and to massive corpora, such as the Web. In this paper, several of these issues are analyzed and modifications of classical similarity measures are proposed. The modifications are based on a contextualized and scalable version of IC computation on the Web that exploits taxonomical knowledge. The goal is to avoid the measures' dependence on corpus pre-processing in order to achieve reliable results, and to minimize language ambiguity. Our proposals are able to outperform classical approaches when the Web is used to estimate concept probabilities.
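
To make the idea of an IC-based measure concrete, the following is a minimal sketch of a classical, Resnik-style similarity in Python: the IC of a concept is the negative log of its appearance probability, and the similarity of two concepts is the IC of their most specific common ancestor in the taxonomy. It illustrates the general family of measures the abstract refers to, not the modified measures proposed in the paper. The taxonomy, the occurrence counts, the total, and all function names are hypothetical; the counts stand in for corpus frequencies or Web hit counts and are assumed to already include the occurrences of each concept's descendants.

```python
import math

# Toy taxonomy as child -> parent links (hypothetical, for illustration only).
PARENT = {
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "sparrow": "bird",
}

# Hypothetical occurrence counts standing in for corpus or Web hit counts.
# Assumption: each count already includes the occurrences of all descendants,
# so counts never increase when moving down the taxonomy.
COUNTS = {
    "animal": 1000,
    "mammal": 400,
    "bird": 300,
    "dog": 150,
    "cat": 120,
    "sparrow": 20,
}
TOTAL = 1000  # assumed total number of observations (the root's count)


def ancestors(concept):
    """Return the concept followed by its ancestors, ordered up to the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def information_content(concept):
    """IC(c) = -log p(c), written as log(TOTAL / count) to avoid a negative zero."""
    return math.log(TOTAL / COUNTS[concept])


def least_common_subsumer(a, b):
    """Return the most specific concept that subsumes both a and b."""
    ancestors_of_b = set(ancestors(b))
    for candidate in ancestors(a):  # walks from most to least specific
        if candidate in ancestors_of_b:
            return candidate
    raise ValueError(f"{a!r} and {b!r} share no ancestor")


def resnik_similarity(a, b):
    """Resnik-style similarity: the IC of the least common subsumer."""
    return information_content(least_common_subsumer(a, b))


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "sparrow")]:
        print(pair, least_common_subsumer(*pair), round(resnik_similarity(*pair), 3))
```

In this toy setting, "dog" and "cat" meet at "mammal", which is rarer than the root and therefore more informative, while "dog" and "sparrow" meet only at the root, whose IC is zero. The sketch also shows where the dependence discussed in the abstract enters: the result is only as good as the concept probabilities behind `COUNTS`, which classically come from a tagged and disambiguated corpus.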