Using Random Walks for Mining Web Document Associations

Authors:
K. Selçuk Candan; Wen-Syan Li
Venue:
PAKDD '00 Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications
Year:
2000

Citing 9
Cited 4

Inferring Web communities from link topology

Proceedings of the ninth ACM conference on Hypertext and hypermedia: links, objects, time and space---structure in hypermedia systems
Improved algorithms for topic distillation in a hyperlinked environment

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Automatic resource compilation by analyzing hyperlink structure and associated text

WWW7 Proceedings of the seventh international conference on World Wide Web 7
The anatomy of a large-scale hypertextual Web search engine

WWW7 Proceedings of the seventh international conference on World Wide Web 7
Finding related pages in the World Wide Web

WWW '99 Proceedings of the eighth international conference on World Wide Web
Trawling the Web for emerging cyber-communities

WWW '99 Proceedings of the eighth international conference on World Wide Web
Mirror, mirror on the Web: a study of host pairs with replicated content

WWW '99 Proceedings of the eighth international conference on World Wide Web
Authoritative sources in a hyperlinked environment

Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Integrating content search with structure analysis for hypermedia retrieval and management

ACM Computing Surveys (CSUR)

Enabling access-privacy for random walk based data analysis applications

Data & Knowledge Engineering
R2DF framework for ranked path queries over weighted RDF graphs

Proceedings of the International Conference on Web Intelligence, Mining and Semantics
Impact neighborhood indexing (INI) in diffusion graphs

Proceedings of the 21st ACM international conference on Information and knowledge management
LR-PPR: locality-sensitive, re-use promoting, approximate personalized pagerank computation

Proceedings of the 22nd ACM international conference on information & knowledge management

Abstract

The World Wide Web has emerged as a primary means for storing and structuring information. In this paper, we present a framework for mining implicit associations among Web documents. We focus on the following problem: "For a given set of seed URLs, find a list of Web pages which reflect the association among these seeds." In the proposed framework, the association between two documents is induced by their connectivity and by the lengths of the linking paths between them. Based on this framework, we have developed a random walk-based Web mining technique and validated it through experiments on real Web data. We also discuss an extension of the algorithm that takes document contents into account.
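
To make the idea of scoring pages by connectivity and path length concrete, here is a minimal sketch in Python. It is not the algorithm from the paper, whose details are not given in this record. It is a generic random walk with restart at the seed pages (in the style of personalized PageRank), run on an undirected link graph. The function name `seed_random_walk_scores`, the `restart` and `iterations` parameters, and the toy page names are all assumptions made for illustration.

```python
def seed_random_walk_scores(links, seeds, restart=0.15, iterations=100):
    """Score pages with a random walk that restarts at the seed pages.

    Illustrative assumption, not the paper's method: pages that are
    well connected to the seeds through short paths get higher scores.
    """
    seeds = set(seeds)
    # Treat links as undirected so connectivity, not direction, drives the score.
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    for s in seeds:
        neighbors.setdefault(s, set())
    nodes = sorted(neighbors)

    # Probability mass injected at each restart: uniform over the seeds.
    restart_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart_mass)

    for _ in range(iterations):
        nxt = {n: restart * restart_mass[n] for n in nodes}
        for n in nodes:
            walking = (1.0 - restart) * scores[n]
            if neighbors[n]:
                # Spread the walking mass evenly over the linked pages.
                share = walking / len(neighbors[n])
                for m in neighbors[n]:
                    nxt[m] += share
            else:
                # Isolated page: the walker jumps back to the seeds.
                for s in seeds:
                    nxt[s] += walking / len(seeds)
        scores = nxt
    return scores


# Toy link graph (hypothetical pages): C links both seeds directly,
# while D is reachable only from seed A through the longer path A - X - D.
links = [("A", "C"), ("C", "B"), ("A", "X"), ("X", "D")]
seeds = {"A", "B"}
scores = seed_random_walk_scores(links, seeds)

# Rank the non-seed pages by score, highest first.
ranking = sorted((p for p in scores if p not in seeds),
                 key=lambda p: scores[p], reverse=True)
print(ranking)
```

In this toy graph, page C sits on a short path between both seeds and so outranks D, which is connected to only one seed through a longer path. A content-aware variant could, for example, weight each link by the similarity of the documents it connects, but that is likewise an assumption rather than a description of the authors' extension.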