Min-wise independent permutations (extended abstract)
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
Trawling the Web for emerging cyber-communities
WWW '99 Proceedings of the eighth international conference on World Wide Web
Reductions in streaming algorithms, with an application to counting triangles in graphs
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
A simple algorithm for finding frequent elements in streams and bags
ACM Transactions on Database Systems (TODS)
An improved data stream algorithm for frequency moments
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Counting triangles in data streams
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Fast Counting of Triangles in Large Real Networks without Counting: Algorithms and Laws
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Mining Large Networks with Subgraph Counting
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
DOULION: counting triangles in massive graphs with a coin
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Estimating clustering indexes in data streams
ESA'07 Proceedings of the 15th annual European conference on Algorithms
Efficient algorithms for large-scale local triangle counting
ACM Transactions on Knowledge Discovery from Data (TKDD)
Exponential time improvement for min-wise based algorithms
Information and Computation
Tight bounds for Lp samplers, finding duplicates in streams, and related problems
Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
New streaming algorithms for counting triangles in graphs
COCOON'05 Proceedings of the 11th annual international conference on Computing and Combinatorics
Colorful triangle counting and a MapReduce implementation
Information Processing Letters
Graph sketches: sparsification, spanners, and subgraphs
PODS '12 Proceedings of the 31st symposium on Principles of Database Systems
Multiplying matrices faster than coppersmith-winograd
STOC '12 Proceedings of the forty-fourth annual ACM symposium on Theory of computing
Streaming algorithms measured in terms of the computed quantity
COCOON'07 Proceedings of the 13th annual international conference on Computing and Combinatorics
Parallel triangle counting in massive streaming graphs
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Hi-index | 0.00 |
Due to a large number of applications, the problem of estimating the number of triangles in graphs revealed as a stream of edges, and the closely related problem of estimating the graph's clustering coefficient, have received considerable attention in the last decade. Both efficient algorithms and impossibility results have shed light on the computational complexity of the problem. Motivated by applications in Web mining, Becchetti et al.~presented new algorithms for the estimation of the local number of triangles, i.e., the number of triangles incident to individual vertices. The algorithms are shown, both theoretically and experimentally, to efficiently handle the problem. However, at least two passes over the data are needed and thus the algorithms are not suitable for real streaming scenarios. In the present work, we consider the problem of estimating the clustering coefficient of individual vertices in a graph over n vertices revealed as a stream of m edges. As a first result we show that any one pass randomized streaming algorithm that can distinguish a graph with no triangles from a graph having a vertex of degree d with clustering coefficient 1/2 must use Ω(m/d) bits of space in expectation. Our second result is a new randomized one pass algorithm estimating the local clustering coefficient of each vertex with degree at least d. The space requirement of our algorithm is within a logarithmic factor of the lower bound, thus our approach is close to optimal. We also extend the algorithm to local triangle counting and report experimental results on its performance on real-life graphs.