Despite the wide diversity of web pages, the pages residing within a particular organization are, in most cases, organized in a semantically hierarchical structure. For example, the website of a computer science department contains pages about its people, courses, and research; pages about people are categorized into faculty, staff, and students, and pages about research divide into different areas. Uncovering such hierarchical structures could give users a convenient way to navigate a site comprehensively and could accelerate other web mining tasks. In this study, we extract a similarity matrix among pages from in-page and cross-page link structures, and on that basis develop a density-based clustering algorithm that hierarchically groups densely linked web pages into semantic clusters. Our experiments show that this method is efficient and effective, and that it sheds light on mining and exploring web structures.
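The abstract does not specify the similarity measure or the clustering procedure, so the following is only an illustrative sketch of the general idea, not the method of the paper. It assumes a Jaccard similarity over each page's link neighborhood, and it substitutes a much simpler technique for the density-based step: connected components of the similarity graph at successively stricter thresholds, which yields a nested grouping. The function names, the thresholds, and the toy site are all hypothetical.

```python
from itertools import combinations


def link_similarity(links):
    """Pairwise Jaccard similarity of link neighborhoods (an assumed measure)."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    # Treat links as undirected and count each page as its own neighbor.
    nbrs = {p: {p} for p in pages}
    for src, targets in links.items():
        for dst in targets:
            nbrs[src].add(dst)
            nbrs[dst].add(src)
    sim = {}
    for a, b in combinations(pages, 2):
        sim[(a, b)] = len(nbrs[a] & nbrs[b]) / len(nbrs[a] | nbrs[b])
    return pages, sim


def clusters_at(pages, sim, threshold):
    """Connected components after keeping only pairs with sim >= threshold."""
    parent = {p: p for p in pages}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), s in sim.items():
        if s >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for p in pages:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


def hierarchy(links, thresholds=(0.3, 0.5, 0.9)):
    """Clusters at each threshold; stricter thresholds refine looser ones."""
    pages, sim = link_similarity(links)
    return {t: clusters_at(pages, sim, t) for t in sorted(thresholds)}


# Hypothetical department site: page -> pages it links to.
toy_site = {
    "home": ["people", "research"],
    "people": ["faculty", "students"],
    "faculty": ["students"],
    "research": ["ai", "db"],
    "ai": ["db"],
}

if __name__ == "__main__":
    for level, groups in hierarchy(toy_site).items():
        print(level, groups)
```

On the toy site, the loosest threshold keeps the whole site together, the middle one separates the people pages from the research pages, and the strictest one isolates the most tightly linked pairs. A real density-based method would additionally distinguish dense cores from sparsely linked pages rather than relying on a single similarity cutoff per level.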