With the exponential growth of information on the World Wide Web, there is great demand for efficient methods of organizing the large volume of retrieved information. Document clustering plays an important role in information retrieval and taxonomy management for the Web. In this paper we examine three clustering methods: K-means, multi-level METIS, and the recently developed normalized-cut method, using a new similarity metric that combines textual information, hyperlink structure, and co-citation relations. We find that the normalized-cut method with the new similarity metric is particularly effective, as demonstrated on three datasets of web query results. We also explore some theoretical connections between the normalized-cut method and the K-means method.
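To make the idea concrete, the following is a minimal sketch of a two-way normalized cut over a blended similarity matrix. It is not the paper's formulation: the abstract does not give the metric, so the weighted sum in `combined_similarity`, the `weights` parameter, the co-citation normalization, and both function names are assumptions chosen for illustration. The partition step uses the standard spectral relaxation, thresholding the second-smallest eigenvector of the normalized Laplacian.

```python
import numpy as np

def combined_similarity(text_sim, adjacency, weights=(0.5, 0.25, 0.25)):
    """Blend text, hyperlink, and co-citation similarity (hypothetical formula)."""
    w_text, w_link, w_cocite = weights
    T = np.asarray(text_sim, dtype=float)
    A = np.asarray(adjacency, dtype=float)   # A[i, j] = 1 if page i links to page j
    link = np.maximum(A, A.T)                # treat hyperlinks as undirected
    cocite = A.T @ A                         # [i, j] = pages linking to both i and j
    np.fill_diagonal(cocite, 0.0)
    if cocite.max() > 0:
        cocite = cocite / cocite.max()       # assumed scaling into [0, 1]
    W = w_text * T + w_link * link + w_cocite * cocite
    W = (W + W.T) / 2.0                      # guard against an asymmetric text_sim
    np.fill_diagonal(W, 0.0)                 # no self-similarity
    return W

def normalized_cut_bipartition(W):
    """Two-way split from the spectral relaxation of the normalized cut."""
    d = W.sum(axis=1)                        # assumes every page has positive degree
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)          # eigenvalues in ascending order
    y = d_inv_sqrt * vecs[:, 1]              # generalized eigenvector of (D - W, D)
    return y >= 0                            # boolean cluster label per page

if __name__ == "__main__":
    # Toy data: two topical groups of three pages each, links only within groups.
    text = np.full((6, 6), 0.05)
    text[:3, :3] = 0.9
    text[3:, 3:] = 0.9
    links = np.zeros((6, 6))
    for i, j in [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]:
        links[i, j] = 1.0
    print(normalized_cut_bipartition(combined_similarity(text, links)))
```

Thresholding at zero is the simplest way to read a partition off the eigenvector; choosing the split point that minimizes the normalized-cut value, or recursing on each half to obtain more than two clusters, are common alternatives.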