A Random Walk Framework to Compute Textual Semantic Similarity: A Unified Model for Three Benchmark Tasks

Authors:
Majid Yazdani;Andrei Popescu-Belis
Affiliations:
-;-
Venue:
ICSC '10 Proceedings of the 2010 IEEE Fourth International Conference on Semantic Computing
Year:
2010

Citing 0
Cited 6

A speech-based just-in-time retrieval system using semantic search
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Systems Demonstrations

Using a Wikipedia-based semantic relatedness measure for document clustering
TextGraphs-6 Proceedings of TextGraphs-6: Graph-based Methods for Natural Language Processing

A just-in-time document retrieval system for dialogues or monologues
SIGDIAL '11 Proceedings of the SIGDIAL 2011 Conference

Harnessing different knowledge sources to measure semantic relatedness under a uniform model
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Computing text semantic relatedness using the contents and links of a hypertext encyclopedia
Artificial Intelligence

Computing text semantic relatedness using the contents and links of a hypertext encyclopedia: extended abstract
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Abstract

A network of concepts is built from Wikipedia documents, and a random walk approach is used to compute distances between documents in this network. Three algorithms for distance computation are considered: hitting/commute time, personalized PageRank, and truncated visiting probability. In parallel, four types of weighted links between documents are considered: actual hyperlinks, lexical similarity, common category membership, and common template use. The resulting network is used to solve three benchmark semantic tasks (word similarity, paraphrase detection between sentences, and document similarity) by mapping each pair of inputs onto the network and then computing a distance between the resulting representations. The model reaches state-of-the-art performance on each task, showing that the constructed network is a general, valuable resource for semantic similarity judgments.
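
To make two of the random-walk measures named in the abstract more concrete, the following is a minimal sketch on a toy weighted graph. It is not the authors' implementation: the four-node graph, its weights, the restart probability, and the function names are all assumptions chosen for illustration, and the truncated visiting probability is given in a simplified, undiscounted form (the chance that a walk reaches a target node within a fixed number of steps), which may differ from the paper's exact definition.

```python
def row_normalize(weights):
    """Turn a weighted adjacency matrix into a row-stochastic transition matrix."""
    transition = []
    for row in weights:
        total = sum(row)
        if total == 0:
            raise ValueError("every node needs at least one outgoing link")
        transition.append([w / total for w in row])
    return transition


def personalized_pagerank(transition, seed, restart=0.15, iterations=100):
    """Power iteration for a random walk that restarts at the seed distribution."""
    n = len(transition)
    rank = list(seed)
    for _ in range(iterations):
        # One walk step: probability mass flows along outgoing links.
        walked = [sum(rank[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # With probability `restart`, jump back to the seed distribution.
        rank = [restart * seed[j] + (1 - restart) * walked[j] for j in range(n)]
    return rank


def truncated_visiting_probability(transition, target, steps):
    """Probability, from each start node, of reaching `target` within `steps` steps."""
    n = len(transition)
    reach = [1.0 if i == target else 0.0 for i in range(n)]
    for _ in range(steps):
        # The target is treated as absorbing: once reached, the visit has happened.
        reach = [
            1.0 if i == target
            else sum(transition[i][j] * reach[j] for j in range(n))
            for i in range(n)
        ]
    return reach


# Hypothetical 4-node concept graph; a larger weight means a stronger link.
weights = [
    [0, 2, 1, 0],
    [2, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
P = row_normalize(weights)

ppr_from_node0 = personalized_pagerank(P, seed=[1.0, 0.0, 0.0, 0.0])
reach_node3 = truncated_visiting_probability(P, target=3, steps=2)

print("personalized PageRank from node 0:", ppr_from_node0)
print("probability of reaching node 3 within 2 steps:", reach_node3)
```

In both cases a higher score between two nodes can be read as greater relatedness, so a distance can be derived from it, for example by negating or inverting the score. How text fragments are mapped to nodes and how the four link types are weighted is outside the scope of this sketch.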