CYC: a large-scale investment in knowledge infrastructure
Communications of the ACM
Probabilistic latent semantic indexing
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Machine Learning
Document clustering with committees
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search
IEEE Transactions on Knowledge and Data Engineering
The Journal of Machine Learning Research
The link prediction problem for social networks
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Fast incremental proximity search in large graphs
Proceedings of the 25th international conference on Machine learning
Query suggestion using hitting time
Proceedings of the 17th ACM conference on Information and knowledge management
Information theoretic measures for clusterings comparison: is a correction for chance necessary?
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
WikiRelate! computing semantic relatedness using wikipedia
AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Corpus-based and knowledge-based measures of text semantic similarity
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Personalizing PageRank for word sense disambiguation
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Computing semantic relatedness using Wikipedia-based explicit semantic analysis
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
WikiWalk: random walks on Wikipedia for semantic relatedness
TextGraphs-4 Proceedings of the 2009 Workshop on Graph-based Methods for Natural Language Processing
ICSC '10 Proceedings of the 2010 IEEE Fourth International Conference on Semantic Computing
Proceedings of the sixth ACM international conference on Web search and data mining
Hi-index | 0.00 |
A graph-based distance between Wikipedia articles is defined using a random walk model, which estimates visiting probability (VP) between articles using two types of links: hyperlinks and lexical similarity relations. The VP to and from a set of articles is then computed, and approximations are proposed to make tractable the computation of semantic relatedness between every two texts in a large data set. The model is applied to document clustering on the 20 Newsgroups data set. Precision and recall are improved in comparison with previous textual distance algorithms.