On Combining Link and Contents Information for Web Page Clustering

Authors:
Yitong Wang;Masaru Kitsuregawa
Affiliations:
-;-
Venue:
DEXA '02 Proceedings of the 13th International Conference on Database and Expert Systems Applications
Year:
2002

Citing 14
Cited 9

Algorithms for clustering data

Algorithms for clustering data
Scatter/Gather: a cluster-based approach to browsing large document collections

SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
HyPursuit: a hierarchical network search engine that exploits content-link hypertext clustering

Proceedings of the the seventh ACM conference on Hypertext
Life, death, and lawfulness on the electronic frontier

Proceedings of the ACM SIGCHI Conference on Human factors in computing systems
Inferring Web communities from link topology

Proceedings of the ninth ACM conference on Hypertext and hypermedia : links, objects, time and space---structure in hypermedia systems: links, objects, time and space---structure in hypermedia systems
Syntactic clustering of the Web

Selected papers from the sixth international conference on World Wide Web
Web document clustering: a feasibility demonstration

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
The anatomy of a large-scale hypertextual Web search engine

WWW7 Proceedings of the seventh international conference on World Wide Web 7
Grouper: a dynamic clustering interface to Web search results

WWW '99 Proceedings of the eighth international conference on World Wide Web
Finding related pages in the World Wide Web

WWW '99 Proceedings of the eighth international conference on World Wide Web
Trawling the Web for emerging cyber-communities

WWW '99 Proceedings of the eighth international conference on World Wide Web
Authoritative sources in a hyperlinked environment

Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Use Link-Based Clustering to Improve Web Search Results

WISE '01 Proceedings of the Second International Conference on Web Information Systems Engineering (WISE'01) Volume 1 - Volume 1

A personalized search engine based on web-snippet hierarchical clustering

WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
Classifying search engine queries using the web as background knowledge

ACM SIGKDD Explorations Newsletter
A personalized search engine based on Web-snippet hierarchical clustering

Software—Practice & Experience
Improving density-based methods for hierarchical clustering of web pages

Data & Knowledge Engineering
A survey of Web clustering engines

ACM Computing Surveys (CSUR)
Density link-based methods for clustering web pages

Decision Support Systems
A new approach in dynamic prediction for user based web page crawling

Proceedings of the International Conference on Management of Emergent Digital EcoSystems
A unified representation of web logs for mining applications

Information Retrieval
PROBABILISTIC HEURISTICS FOR HIERARCHICAL WEB DATA CLUSTERING

Computational Intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

Clustering is currently one of the most crucial techniques for dealing (e.g. resources locating, information interpreting) with massive amount of heterogeneous information on the web, which is beyond human being's capacity to digest. In this paper, we discuss the shortcomings of pervious approaches and present a unifying clustering algorithm to cluster web search results for a specific query topic by combining link and contents information. Especially, we investigate how to combine link and contents analysis in clustering process to improve the quality and interpretation of web search results .The proposed approach automatically clusters the web search results into high quality, semantically meaningful groups in a concise, easy-to-interpret hierarchy with tagging terms. Preliminary experiments and evaluations are conducted and the experimental results show that the proposed approach is effective and promising. Keywords: co-citation, coupling, anchor window, snippet