Text relatedness based on a word thesaurus

Authors:
George Tsatsaronis;Iraklis Varlamis;Michalis Vazirgiannis
Affiliations:
Department of Computer and Information Science, Norwegian University of Science and Technology, Norway;Department of Informatics and Telematics, Harokopio University, Greece;Department of Informatics, Athens University of Economics and Business, Greece
Venue:
Journal of Artificial Intelligence Research
Year:
2010

Abstract

Automatically computing the relatedness between two fragments of text requires taking into account a wide range of factors pertaining to the meaning the two fragments convey and the pairwise relations between their words. A measure of relatedness between text segments must therefore account for both the lexical and the semantic relatedness between words. A measure that captures both aspects well may help in many tasks, such as text retrieval, classification and clustering. In this paper we present a new approach for measuring the semantic relatedness between words based on their implicit semantic links. The approach uses only a word thesaurus to derive these implicit links. Building on it, we introduce Omiotis, a new measure of semantic relatedness between texts that extends the word-to-word semantic relatedness measure (SR) to the text level. We validate our method in stages: we first evaluate the performance of the semantic relatedness measure between individual words, covering word-to-word similarity and relatedness, synonym identification and word analogy; we then evaluate the performance of our method in measuring text-to-text semantic relatedness in two tasks, namely sentence-to-sentence similarity and paraphrase recognition. Experimental evaluation shows that the proposed method outperforms every lexicon-based method of semantic relatedness in the selected tasks and on the data sets used, and competes well against corpus-based and hybrid approaches.
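
The general idea of lifting a thesaurus-based word-to-word score to a text-to-text score can be illustrated with a minimal sketch. The abstract does not give the definitions of SR or Omiotis, so nothing below should be read as the authors' formulas: the toy thesaurus, the inverse path-length word score, and the symmetric best-match averaging are all assumptions chosen only to make the two-level structure concrete.

```python
from collections import deque

# Hypothetical toy thesaurus (an assumption for illustration):
# each word maps to the words it is directly linked to.
TOY_THESAURUS = {
    "car": {"vehicle", "automobile"},
    "automobile": {"car"},
    "vehicle": {"car", "truck"},
    "truck": {"vehicle"},
    "banana": {"fruit"},
    "fruit": {"banana"},
}


def path_length(source, target, graph):
    """Breadth-first search: number of links on a shortest path, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        word, dist = queue.popleft()
        for neighbor in graph.get(word, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def word_relatedness(a, b, graph=TOY_THESAURUS):
    """Stand-in word score (NOT the paper's SR): 1 / (1 + path length)."""
    dist = path_length(a, b, graph)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def text_relatedness(text_a, text_b, graph=TOY_THESAURUS):
    """Stand-in text score (NOT Omiotis): symmetric mean of best matches."""
    words_a, words_b = text_a.lower().split(), text_b.lower().split()
    if not words_a or not words_b:
        return 0.0

    def directed(src, dst):
        # For each source word, keep its best-scoring partner in dst.
        best = [max(word_relatedness(w, v, graph) for v in dst) for w in src]
        return sum(best) / len(src)

    return 0.5 * (directed(words_a, words_b) + directed(words_b, words_a))
```

Under these assumptions, words with no connecting path in the thesaurus score zero, and a text pair scores higher the more of its words find a close partner in the other text. A real implementation would replace both stand-in functions with the measures defined in the paper.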