Finding cohesive clusters for analyzing knowledge communities

Authors:
Vasileios Kandylas;S. Phineas Upham;Lyle H. Ungar
Affiliations:
University of Pennsylvania, Department of Computer and Information Science, 19104, Philadelphia, PA, USA;University of Pennsylvania, Wharton School, Philadelphia, PA, USA;University of Pennsylvania, Department of Computer and Information Science, 19104, Philadelphia, PA, USA
Venue:
Knowledge and Information Systems
Year:
2008

Abstract

Documents and authors can be clustered into “knowledge communities” based on the overlap in the papers they cite. We introduce a new clustering algorithm, Streemer, which finds cohesive foreground clusters embedded in a diffuse background, and use it to identify knowledge communities as foreground clusters of papers that share common citations. To analyze the evolution of these communities over time, we build predictive models with features based on the citation structure, the vocabulary of the papers, and the affiliations and prestige of the authors. We find that scientific knowledge communities tend to grow more rapidly if their publications build on diverse information and if they use a narrow vocabulary.