Randomized algorithms
Size-estimation framework with applications to transitive closure and reachability
Journal of Computer and System Sciences
Proceedings of the sixth annual ACM-SIAM symposium on Discrete algorithms
Communications of the ACM
Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports
Proceedings of the 27th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Counting Distinct Elements in a Data Stream
RANDOM '02 Proceedings of the 6th International Workshop on Randomization and Approximation Techniques
Join-distinct aggregate estimation over update streams
Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Fast sparse matrix multiplication
ACM Transactions on Algorithms (TALG)
Faster join-projects and sparse matrix multiplications
Proceedings of the 12th International Conference on Database Theory
On estimating path aggregates over streaming graphs
ISAAC'06 Proceedings of the 17th international conference on Algorithms and Computation
Compressed matrix multiplication
Proceedings of the 3rd Innovations in Theoretical Computer Science Conference
Space-round tradeoffs for MapReduce computations
Proceedings of the 26th ACM international conference on Supercomputing
Improved counter based algorithms for frequent pairs mining in transactional data streams
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I
Compressed matrix multiplication
ACM Transactions on Computation Theory (TOCT) - Special issue on innovations in theoretical computer science 2012
We consider the problem of fast and reliable estimation of the number of non-zero entries in a sparse boolean matrix product. Let n denote the total number of non-zero entries in the input matrices. We show how to compute a 1 ± ε approximation (with small probability of error) in expected time O(n) for any ε > 4/n^{1/4}. The previously best estimation algorithm, due to Cohen (JCSS 1997), uses time O(n/ε^2). We also present a variant using O(sort(n)) I/Os in expectation in the cache-oblivious model. Finally, we describe how sampling can be used to maintain (independent) sketches of matrices that allow estimation to be performed in time o(n) if z, the number of non-zero entries in the product, is sufficiently large. This gives a simpler alternative to the sketching technique of Ganguly et al. (PODS 2005), and matches a space lower bound shown in that paper.
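To make the estimation problem concrete, here is a minimal sketch of a min-rank estimator in the spirit of the size-estimation framework attributed to Cohen above. It is not the linear-time algorithm announced in this abstract. It runs in roughly O(n · rounds) time, which is the kind of cost the abstract ascribes to the earlier approach. The function names, the entry-list input format, and the `rounds` and `seed` parameters are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def exact_nnz(a_entries, b_entries):
    """Exact number of non-zeros in the boolean product A*B (full enumeration).

    a_entries: list of (i, j) positions of non-zeros in A.
    b_entries: list of (j, k) positions of non-zeros in B.
    """
    b_by_row = defaultdict(list)
    for j, k in b_entries:
        b_by_row[j].append(k)
    return len({(i, k) for i, j in a_entries for k in b_by_row[j]})


def estimate_nnz(a_entries, b_entries, rounds=200, seed=0):
    """Estimate the number of non-zeros in A*B without forming the product.

    Illustrative min-rank sketch (an assumption, not the paper's method):
    each round gives every row i of A an Exp(1) rank and pushes the minimum
    rank through A and then B. For a product column with s non-zeros, the
    minimum is Exp(s)-distributed, so (rounds - 1) / (sum of minima)
    is an unbiased estimate of s.
    """
    rng = random.Random(seed)
    rows = sorted({i for i, _ in a_entries})
    min_sums = defaultdict(float)  # column k -> sum of per-round minima
    inf = float("inf")
    for _ in range(rounds):
        rank = {i: rng.expovariate(1.0) for i in rows}
        # Smallest rank among rows of A that have a non-zero in column j.
        mid = {}
        for i, j in a_entries:
            if rank[i] < mid.get(j, inf):
                mid[j] = rank[i]
        # Smallest rank among rows of A that reach column k of the product.
        col = {}
        for j, k in b_entries:
            if j in mid and mid[j] < col.get(k, inf):
                col[k] = mid[j]
        for k, m in col.items():
            min_sums[k] += m
    # Sum the per-column size estimates to estimate the total count.
    return sum((rounds - 1) / s for s in min_sums.values())
```

Each round touches every input non-zero once, so the accuracy is controlled by `rounds`: the relative standard deviation of each per-column estimate is about 1/sqrt(rounds - 2).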