Networked information spaces contain information entities, corresponding to nodes, which are connected by associations, corresponding to links in the network. Examples of networked information spaces are the World Wide Web, where information entities are web pages and associations are hyperlinks, and the scientific literature, where information entities are articles and associations are references to other articles. Similarity between information entities in a networked information space can be defined not only by the content of the entities, but also by the connectivity that the associations establish. This paper explores the definition of similarity based on connectivity only, and proposes several algorithms for this purpose. Our metrics rely only on the local neighborhoods of the nodes in the networked information space. The full network therefore does not need to be explicitly available, as long as a query engine is available for following links and extracting the local neighborhoods needed for similarity estimation. Two variations of similarity estimation between two nodes are described: one based on the separate local neighborhoods of the two nodes, and another based on the joint local neighborhood expanded from both nodes at the same time. The algorithms are implemented and evaluated on the citation graph of computer science. The immediate application of this work is finding papers similar to a given paper in a digital library, but the algorithms are also applicable to other networked information spaces, such as the Web.
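As a rough illustration of similarity from connectivity alone, the sketch below compares the separate local neighborhoods of two nodes. The abstract does not specify the metrics themselves, so everything here is an assumption made for illustration: the Jaccard overlap used as the score, the `radius` parameter, the function names, and the dictionary `toy_graph`, which stands in for a query engine that returns the links of a node. The joint-neighborhood variant is not sketched.

```python
from collections import deque

def local_neighborhood(graph, start, radius):
    """Return the set of nodes within `radius` link hops of `start`.

    `graph` is a hypothetical stand-in for a query engine: it maps a
    node to the nodes it links to, and is only consulted locally.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == radius:
            continue  # do not expand beyond the requested radius
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen

def separate_neighborhood_similarity(graph, a, b, radius=1):
    """Jaccard overlap of the two separately expanded neighborhoods.

    Assumption: Jaccard overlap is used only as an example score; it is
    not taken from the paper. The two query nodes are excluded so the
    score reflects shared context rather than the nodes themselves.
    """
    neighborhood_a = local_neighborhood(graph, a, radius) - {a, b}
    neighborhood_b = local_neighborhood(graph, b, radius) - {a, b}
    union = neighborhood_a | neighborhood_b
    if not union:
        return 0.0
    return len(neighborhood_a & neighborhood_b) / len(union)

# Toy citation graph (invented data): each paper maps to its references.
toy_graph = {
    "p1": ["a", "b", "c"],
    "p2": ["b", "c", "d"],
    "p3": ["e"],
}

print(separate_neighborhood_similarity(toy_graph, "p1", "p2"))  # 0.5
print(separate_neighborhood_similarity(toy_graph, "p1", "p3"))  # 0.0
```

In this toy graph, `p1` and `p2` share two of their four distinct references, giving 0.5, while `p1` and `p3` share none. Because only `graph.get(node, ())` is called, the same code would work against any service that can return the links of a single node on demand.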