An Analysis of Some Graph Theoretical Cluster Techniques

Authors:
J. Gary Augustson;Jack Minker
Affiliations:
University of Maryland, Computer Science Center, College Park, Maryland;University of Maryland, Computer Science Center, College Park, Maryland
Venue:
Journal of the ACM (JACM)
Year:
1970

Abstract

Several graph theoretic cluster techniques aimed at the automatic generation of thesauri for information retrieval systems are explored. Experimental cluster analysis is performed on a sample corpus of 2267 documents. A term-term similarity matrix is constructed for the 3950 unique terms used to index the documents. Various threshold values, T, are applied to the similarity matrix to provide a series of binary threshold matrices. The corresponding graph of each binary threshold matrix is used to obtain the term clusters.

Three definitions of a cluster are analyzed: (1) the connected components of the threshold matrix; (2) the maximal complete subgraphs of the connected components of the threshold matrix; (3) clusters of the maximal complete subgraphs of the threshold matrix, as described by Gotlieb and Kumar.

Algorithms are described and analyzed for obtaining each cluster type. The algorithms are designed to be useful for large document and index collections. Two algorithms have been tested that find maximal complete subgraphs. An algorithm developed by Bierstone offers a significant time improvement over one suggested by Bonner.

For threshold levels T ≥ 0.6, basically the same clusters are developed regardless of the cluster definition used. In such situations one need only find the connected components of the graph to develop the clusters.
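
The first two cluster definitions can be illustrated with a minimal sketch. It is not the paper's implementation. The 5-term similarity matrix is invented for illustration, the comparison `sim[i][j] >= t` is an assumption about how the threshold is applied, and the maximal complete subgraphs are found with a basic Bron-Kerbosch recursion, which stands in for the Bierstone and Bonner algorithms named in the abstract. All function names are hypothetical.

```python
from typing import Dict, List, Set


def threshold_graph(sim: List[List[float]], t: float) -> Dict[int, Set[int]]:
    """Adjacency sets of the binary threshold matrix (assumed rule: sim >= t)."""
    n = len(sim)
    adj: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= t:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def connected_components(adj: Dict[int, Set[int]]) -> List[List[int]]:
    """Cluster definition (1): connected components, via iterative DFS."""
    seen: Set[int] = set()
    comps: List[List[int]] = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(sorted(comp))
    return comps


def maximal_cliques(adj: Dict[int, Set[int]]) -> List[List[int]]:
    """Cluster definition (2): maximal complete subgraphs (basic Bron-Kerbosch)."""
    cliques: List[List[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended, so it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later cliques at this level

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical term-term similarity matrix for five index terms.
SIM = [
    [1.0, 0.8, 0.7, 0.1, 0.0],
    [0.8, 1.0, 0.9, 0.2, 0.1],
    [0.7, 0.9, 1.0, 0.65, 0.0],
    [0.1, 0.2, 0.65, 1.0, 0.3],
    [0.0, 0.1, 0.0, 0.3, 1.0],
]

if __name__ == "__main__":
    for t in (0.6, 0.75):
        g = threshold_graph(SIM, t)
        print(t, connected_components(g), maximal_cliques(g))
```

In this toy matrix the two definitions differ at both thresholds: at T = 0.6, terms 0 to 3 form one connected component but two maximal complete subgraphs. The abstract's finding that the definitions largely agree for T ≥ 0.6 is an empirical result on the paper's corpus, which this sketch does not reproduce.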