Computing semantic relatedness using word frequency and layout information of Wikipedia

Authors:
Patrick Chan;Yoshinori Hijikata;Shogo Nishida
Affiliations:
Osaka University, Osaka, Japan;Osaka University, Osaka, Japan;Osaka University, Osaka, Japan
Venue:
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Year:
2013

Abstract

Computing the semantic relatedness between two words or phrases is an important problem in fields such as information retrieval and natural language processing. One state-of-the-art approach to this problem is Explicit Semantic Analysis (ESA). ESA estimates relevance from word frequency in Wikipedia articles, so the relevance of low-frequency words cannot always be estimated well. To improve the relevance estimates for low-frequency words, we use not only word frequency but also layout information in Wikipedia articles. Empirical evaluation shows that, on low-frequency words, our method estimates semantic relatedness better than ESA.
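
As a rough illustration of the idea in the abstract, the sketch below builds an ESA-style index over a toy article collection. Each word maps to a vector of article ("concept") weights, and two texts are compared by the cosine of their concept vectors. The abstract does not say how layout information is combined with word frequency, so the per-field multiplier used here is purely an assumption. The article data, the field names (`title`, `body`), the `LAYOUT_WEIGHTS` values, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Toy stand-in for Wikipedia: each article is a list of (layout_field, text) pairs.
# The articles and field names are made up for illustration.
ARTICLES = {
    "Cat": [("title", "cat"), ("body", "the cat is a small feline pet that purrs")],
    "Dog": [("title", "dog"), ("body", "the dog is a loyal pet that barks")],
    "Car": [("title", "car"), ("body", "a car is a motor vehicle with an engine and wheels")],
}

# ASSUMPTION: a word occurrence in a title counts three times as much as one
# in the body. These weights are hypothetical, not the paper's values.
LAYOUT_WEIGHTS = {"title": 3.0, "body": 1.0}


def term_weights(fields, layout_weights):
    """Layout-weighted term frequency for one article."""
    counts = Counter()
    for field, text in fields:
        weight = layout_weights.get(field, 1.0)
        for token in text.lower().split():
            counts[token] += weight
    return counts


def build_index(articles, layout_weights):
    """Return {word: {concept: tf * idf}} over all articles."""
    per_article = {name: term_weights(f, layout_weights) for name, f in articles.items()}
    n_articles = len(per_article)
    doc_freq = Counter()
    for counts in per_article.values():
        doc_freq.update(counts.keys())
    index = {}
    for name, counts in per_article.items():
        for word, tf in counts.items():
            idf = math.log(n_articles / doc_freq[word])
            if idf > 0:  # skip words that occur in every article
                index.setdefault(word, {})[name] = tf * idf
    return index


def concept_vector(text, index):
    """Sum the concept vectors of all words in the text."""
    vec = Counter()
    for token in text.lower().split():
        for concept, weight in index.get(token, {}).items():
            vec[concept] += weight
    return vec


def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def relatedness(a, b, index):
    """ESA-style relatedness: cosine similarity of the two concept vectors."""
    return cosine(concept_vector(a, index), concept_vector(b, index))


index = build_index(ARTICLES, LAYOUT_WEIGHTS)
flat_index = build_index(ARTICLES, {"title": 1.0, "body": 1.0})
```

With this toy data, `relatedness("cat", "pet", index)` falls strictly between 0 and 1, while `relatedness("cat", "engine", index)` is 0 because the two words share no article. Comparing `index["cat"]["Cat"]` with `flat_index["cat"]["Cat"]` shows how the assumed title weight raises the score of a word that occurs only a few times but appears in a prominent layout position.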