Node similarity in networked information spaces

Authors:
Wangzhong Lu;Jeannette Janssen;Evangelos Milios;Nathalie Japkowicz
Affiliations:
Dalhousie University, Halifax, Nova Scotia, Canada, B3H 3J5;Dalhousie University, Halifax, Nova Scotia, Canada, B3H 3J5;Dalhousie University, Halifax, Nova Scotia, Canada, B3H 3J5;Dalhousie University, Halifax, Nova Scotia, Canada, B3H 3J5
Venue:
CASCON '01 Proceedings of the 2001 conference of the Centre for Advanced Studies on Collaborative research
Year:
2001

Abstract

Networked information spaces contain information entities, corresponding to nodes, which are connected by associations, corresponding to links in the network. Examples of networked information spaces are the World Wide Web, where information entities are web pages and associations are hyperlinks, and the scientific literature, where information entities are articles and associations are references to other articles. Similarity between information entities in a networked information space can be defined not only based on the content of the information entities, but also based on the connectivity established by the associations present. This paper explores the definition of similarity based on connectivity only, and proposes several algorithms for this purpose. Our metrics take advantage of the local neighbourhoods of the nodes in the networked information space. Therefore, explicit availability of the networked information space is not required, as long as a query engine is available for following links and extracting the necessary local neighbourhoods for similarity estimation. Two variations of similarity estimation between two nodes are described, one based on the separate local neighbourhoods of the nodes, and another based on the joint local neighbourhood expanded from both nodes at the same time. The algorithms are implemented and evaluated on the citation graph of computer science. The immediate application of this work is in finding papers similar to a given paper in a digital library, but the algorithms are also applicable to other networked information spaces, such as the Web.
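
The abstract does not give the actual metrics, so the following is only a minimal illustrative sketch of the general idea of connectivity-based similarity over local neighbourhoods, not the authors' algorithms. It assumes a hypothetical `neighbours(node)` callable standing in for the query engine that follows links, a hypothetical hop limit `radius`, and Jaccard overlap as a stand-in similarity score. The function names and the toy citation data are likewise invented for illustration.

```python
from collections import deque


def local_neighbourhood(starts, neighbours, radius):
    """Collect every node within `radius` link hops of any node in `starts`.

    `neighbours` is a hypothetical stand-in for a query engine: it takes a
    node and returns the nodes directly linked to it.
    """
    seen = set(starts)
    frontier = deque((node, 0) for node in starts)
    while frontier:
        node, depth = frontier.popleft()
        if depth == radius:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


def separate_neighbourhood_similarity(a, b, neighbours, radius=1):
    """Jaccard overlap of the two separately extracted neighbourhoods.

    Jaccard overlap is an assumption made for this sketch only.
    """
    na = local_neighbourhood([a], neighbours, radius)
    nb = local_neighbourhood([b], neighbours, radius)
    return len(na & nb) / len(na | nb)


def joint_neighbourhood(a, b, neighbours, radius=1):
    """Neighbourhood expanded from both nodes at the same time."""
    return local_neighbourhood([a, b], neighbours, radius)


# Toy citation data (invented): each paper maps to the papers it references.
references = {"p1": ["r1", "r2"], "p2": ["r2", "r3"], "p3": ["r4"]}

# Treat references as undirected links for neighbourhood expansion.
adjacency = {}
for paper, cited in references.items():
    for ref in cited:
        adjacency.setdefault(paper, set()).add(ref)
        adjacency.setdefault(ref, set()).add(paper)


def toy_neighbours(node):
    return adjacency.get(node, set())


sim_p1_p2 = separate_neighbourhood_similarity("p1", "p2", toy_neighbours)
sim_p1_p3 = separate_neighbourhood_similarity("p1", "p3", toy_neighbours)
joint_p1_p2 = joint_neighbourhood("p1", "p2", toy_neighbours)
```

In this toy graph, `p1` and `p2` share the reference `r2` and so receive a non-zero score, while `p1` and `p3` share nothing. Only the link-following callable is consulted, which mirrors the point that the full graph need not be available up front.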