Link Based Clustering of Web Search Results

Authors:
Yitong Wang;Masaru Kitsuregawa
Venue:
WAIM '01 Proceedings of the Second International Conference on Advances in Web-Age Information Management
Year:
2001

Citing 15
Cited 8

Algorithms for clustering data

Algorithms for clustering data
Scatter/Gather: a cluster-based approach to browsing large document collections

SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
HyPursuit: a hierarchical network search engine that exploits content-link hypertext clustering

Proceedings of the seventh ACM conference on Hypertext
Life, death, and lawfulness on the electronic frontier

Proceedings of the ACM SIGCHI Conference on Human factors in computing systems
Inferring Web communities from link topology

Proceedings of the ninth ACM conference on Hypertext and hypermedia: links, objects, time and space---structure in hypermedia systems
Syntactic clustering of the Web

Selected papers from the sixth international conference on World Wide Web
Database techniques for the World-Wide Web: a survey

ACM SIGMOD Record
Web document clustering: a feasibility demonstration

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Improved algorithms for topic distillation in a hyperlinked environment

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Automatic resource compilation by analyzing hyperlink structure and associated text

WWW7 Proceedings of the seventh international conference on World Wide Web 7
The anatomy of a large-scale hypertextual Web search engine

WWW7 Proceedings of the seventh international conference on World Wide Web 7
Grouper: a dynamic clustering interface to Web search results

WWW '99 Proceedings of the eighth international conference on World Wide Web
Trawling the Web for emerging cyber-communities

WWW '99 Proceedings of the eighth international conference on World Wide Web
Authoritative sources in a hyperlinked environment

Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

H-BayesClust: A New Hierarchical Clustering Based on Bayesian Networks

ADMA '07 Proceedings of the 3rd international conference on Advanced Data Mining and Applications
Visualizing Blogsphere Using Content Based Clusters

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
Using semantic techniques to access web data

Information Systems
A topology-driven approach to the design of web meta-search clustering engines

SOFSEM'05 Proceedings of the 31st international conference on Theory and Practice of Computer Science
A survey on statistical relational learning

AI'10 Proceedings of the 23rd Canadian conference on Advances in Artificial Intelligence
Probabilistic Heuristics for Hierarchical Web Data Clustering

Computational Intelligence
Fine-grained topic detection in news search results

Proceedings of the 27th Annual ACM Symposium on Applied Computing
Determining the titles of Web pages using anchor text and link analysis

Expert Systems with Applications: An International Journal

Abstract

With the proliferation of information on the Web, how to obtain high-quality information has become one of the hot research topics in many fields, such as databases, IR, and AI. Web search engines are the most commonly used tool for information retrieval; however, their current state is far from satisfactory. In this paper, we propose a new approach that uses link analysis to cluster the search results returned by a Web search engine. Unlike document clustering algorithms in IR, which are based on common words or phrases shared between documents, our approach is based on common links shared by pages, using co-citation and coupling analysis. We also extend the standard K-means clustering algorithm to handle noise more naturally and apply it to Web search results. By filtering out irrelevant pages, our approach clusters high-quality pages into groups to facilitate users' access and browsing. Preliminary experiments and evaluations are conducted to investigate its effectiveness. The experimental results show that clustering Web search results via link analysis is promising.
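
The idea of grouping pages by shared links rather than shared words can be illustrated with a minimal sketch. The abstract does not give the paper's similarity measure or its extended K-means, so everything below is an assumption made for illustration: the cosine-style overlap, the equal weighting of coupling (shared out-links) and co-citation (shared in-links), the threshold value, the single-pass grouping that stands in for K-means, and all function names and sample data.

```python
from math import sqrt


def cosine_overlap(a: set, b: set) -> float:
    """Cosine similarity of two link sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def link_similarity(p: dict, q: dict) -> float:
    """Average of coupling (shared out-links) and co-citation (shared in-links).

    The equal weighting is an assumption, not taken from the paper.
    """
    coupling = cosine_overlap(p["out"], q["out"])
    cocitation = cosine_overlap(p["in"], q["in"])
    return (coupling + cocitation) / 2


def cluster(pages: dict, threshold: float = 0.3) -> list:
    """Single-pass grouping (a simplified stand-in for the extended K-means).

    Each page joins the cluster whose seed (first member) is most similar,
    provided the similarity reaches the threshold; otherwise it starts a
    new cluster. Singleton clusters can be read as candidate noise.
    """
    clusters = []
    for pid, page in pages.items():
        best, best_sim = None, 0.0
        for members in clusters:
            sim = link_similarity(page, pages[members[0]])
            if sim > best_sim:
                best, best_sim = members, sim
        if best is not None and best_sim >= threshold:
            best.append(pid)
        else:
            clusters.append([pid])
    return clusters


# Hypothetical search results: each page lists its in-links and out-links.
pages = {
    "a": {"in": {"x", "y"}, "out": {"m", "n"}},
    "b": {"in": {"x", "y"}, "out": {"m", "o"}},
    "c": {"in": {"z"}, "out": {"q"}},
}
groups = cluster(pages)
```

In this toy data, pages "a" and "b" share both in-links and one out-link, so they end up grouped together, while "c" shares no links with either and remains alone, which is the kind of page a noise-aware method might filter out.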