When computing the similarity of scientific papers, text-based measures consider only content and link-based measures consider only citations, so each captures a single aspect of a paper. In this paper, we propose a new approach that computes the similarity of scientific papers more accurately by combining text-based and link-based similarity measures. The proposed method considers the content and citations of a paper simultaneously and uses SVMrank to combine the content-based and citation-based similarity scores. We demonstrate its effectiveness through extensive experiments on a real-world dataset of scientific papers. The results show that our approach improves accuracy by more than 20% over previous methods.
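The idea of combining a content-based score with a citation-based score can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say which text-based or link-based measures are used, so the cosine similarity over term frequencies, the Jaccard overlap of reference lists, and the fixed weights below are all assumptions chosen for illustration. In the paper the combination is learned with SVMrank; here the hypothetical `weights` argument merely stands in for such a learned linear model.

```python
import math
from collections import Counter


def text_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine similarity over raw term-frequency vectors (assumed text-based measure)."""
    tf_a = Counter(doc_a.lower().split())
    tf_b = Counter(doc_b.lower().split())
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def link_similarity(refs_a: set, refs_b: set) -> float:
    """Jaccard overlap of the two reference lists (assumed link-based measure)."""
    union = refs_a | refs_b
    return len(refs_a & refs_b) / len(union) if union else 0.0


def combined_similarity(text_score: float, link_score: float,
                        weights: tuple = (0.5, 0.5)) -> float:
    """Linear combination of the two scores.

    The weights are hypothetical placeholders; a ranking model such as
    SVMrank would learn them from training data instead.
    """
    w_text, w_link = weights
    return w_text * text_score + w_link * link_score


# Usage: two toy papers described by their text and the IDs of the papers they cite.
paper_a = ("similarity of scientific papers", {"p1", "p2", "p3"})
paper_b = ("similarity measures for papers", {"p2", "p3", "p4"})
score = combined_similarity(
    text_similarity(paper_a[0], paper_b[0]),
    link_similarity(paper_a[1], paper_b[1]),
)
```

A linear combination is used because a linear ranking model also scores items as a weighted sum of features, so replacing the fixed weights with learned ones would not change the structure of the sketch.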