Communications of the ACM
Dynamic Perfect Hashing: Upper and Lower Bounds
SIAM Journal on Computing
GIGAswitch system: a high-performance packet-switching platform
Digital Technical Journal
STOC '97 Proceedings of the twenty-ninth annual ACM symposium on Theory of computing
The art of computer programming, volume 1 (3rd ed.): fundamental algorithms
Syntactic clustering of the Web
Selected papers from the sixth international conference on World Wide Web
SODA '90 Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms
The AltaVista Revolution: How to Find Anything on the Internet
On the Resemblance and Containment of Documents
SEQUENCES '97 Proceedings of the Compression and Complexity of Sequences 1997
Pairwise Independence and Derandomization
A small approximately min-wise independent family of hash functions
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Selectivity estimation for Boolean queries
PODS '00 Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
On permutations with limited independence
SODA '00 Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms
Min-Wise versus linear independence (extended abstract)
SODA '00 Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms
Efficient and tunable similar set retrieval
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Finding large independent sets of hypergraphs in parallel
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Estimating simple functions on the union of data streams
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Sampling algorithms: lower bounds and applications
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Similarity estimation techniques from rounding algorithms
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Evaluating strategies for similarity search on the web
Proceedings of the 11th international conference on World Wide Web
Pass efficient algorithms for approximating large matrices
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
An Approximate Lp-Difference Algorithm for Massive Data Streams
STACS '00 Proceedings of the 17th Annual Symposium on Theoretical Aspects of Computer Science
A Derandomization Using Min-Wise Independent Permutations
RANDOM '98 Proceedings of the Second International Workshop on Randomization and Approximation Techniques in Computer Science
Identifying and Filtering Near-Duplicate Documents
CPM '00 Proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching
Estimating Rarity and Similarity over Data Stream Windows
ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
Algorithmic aspects of information retrieval on the web
Handbook of massive data sets
On the sample size of k-restricted min-wise independent permutations and other k-wise distributions
Proceedings of the thirty-fifth annual ACM symposium on Theory of computing
An Approximate L1-Difference Algorithm for Massive Data Streams
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Generalized substring selectivity estimation
Journal of Computer and System Sciences - Special issue on PODS 2000
Processing set expressions over continuous update streams
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
A derandomization using min-wise independent permutations
Journal of Discrete Algorithms
Tracking set-expression cardinalities over continuous update streams
The VLDB Journal — The International Journal on Very Large Data Bases
Data streams: algorithms and applications
Foundations and Trends® in Theoretical Computer Science
p2pDating: Real life inspired semantic overlay networks for Web search
Information Processing and Management: an International Journal
Detecting near-duplicates for web crawling
Proceedings of the 16th international conference on World Wide Web
Detectives: detecting coalition hit inflation attacks in advertising networks streams
Proceedings of the 16th international conference on World Wide Web
Counting distinct items over update streams
Theoretical Computer Science
Time-decaying sketches for sensor data aggregation
Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
A Sketch Algorithm for Estimating Two-Way and Multi-Way Associations
Computational Linguistics
Disorder inequality: a combinatorial approach to nearest neighbor search
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Estimators and tail bounds for dimension reduction in lα (0 < α ≤ 2) using stable random projections
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Why go logarithmic if we can go linear?: Towards effective distinct counting of search traffic
EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Efficient semi-streaming algorithms for local triangle counting in massive graphs
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Computing Frequent Elements Using Gossip
SIROCCO '08 Proceedings of the 15th international colloquium on Structural Information and Communication Complexity
Efficiently matching sets of features with random histograms
MM '08 Proceedings of the 16th ACM international conference on Multimedia
Combinatorial algorithms for nearest neighbors, near-duplicates and small-world design
SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
Optimized union of non-disjoint distributed data sets
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Perturbed identity matrices have high rank: Proof and applications
Combinatorics, Probability and Computing
The design of a similarity based deduplication system
SYSTOR '09 Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference
Applying syntactic similarity algorithms for enterprise information management
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Estimating the confidence of conditional functional dependencies
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Frequent Itemset Mining for Clustering Near Duplicate Web Documents
ICCS '09 Proceedings of the 17th International Conference on Conceptual Structures: Conceptual Structures: Leveraging Semantic Technologies
Combinatorial Framework for Similarity Search
SISAP '09 Proceedings of the 2009 Second International Workshop on Similarity Search and Applications
Automatic retrieval of similar content using search engine query interface
Proceedings of the 18th ACM conference on Information and knowledge management
A distributed placement service for graph-structured and tree-structured data
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
An incremental clustering scheme for data de-duplication
Data Mining and Knowledge Discovery
Scalable techniques for document identifier assignment in inverted indexes
Proceedings of the 19th international conference on World wide web
Efficient algorithms for large-scale local triangle counting
ACM Transactions on Knowledge Discovery from Data (TKDD)
A lightweight privacy preserving SMS-based recommendation system for mobile users
Proceedings of the fourth ACM conference on Recommender systems
A locality-sensitive hash for real vectors
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
Identifying frequent items in a network using gossip
Journal of Parallel and Distributed Computing
pq-hash: an efficient method for approximate XML joins
WAIM'10 Proceedings of the 2010 international conference on Web-age information management
SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval
Exponential time improvement for min-wise based algorithms
Information and Computation
PRESIDIO: A Framework for Efficient Archival Data Storage
ACM Transactions on Storage (TOS)
Local graph sparsification for scalable clustering
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
ATLAS: a probabilistic algorithm for high dimensional similarity search
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
XStreamCluster: an efficient algorithm for streaming XML data clustering
DASFAA'11 Proceedings of the 16th international conference on Database systems for advanced applications - Volume Part I
Probabilistic near-duplicate detection using simhash
Proceedings of the 20th ACM international conference on Information and knowledge management
Distinct estimate of set expressions over sliding windows
APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development
Algorithms for satisfiability using independent sets of variables
SAT'04 Proceedings of the 7th international conference on Theory and Applications of Satisfiability Testing
Compact features for detection of near-duplicates in distributed retrieval
SPIRE'06 Proceedings of the 13th international conference on String Processing and Information Retrieval
IQN routing: integrating quality and novelty in P2P querying and ranking
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Counting distinct items over update streams
ISAAC'05 Proceedings of the 16th international conference on Algorithms and Computation
Exponential time improvement for min-wise based algorithms
Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms
Bayesian locality sensitive hashing for fast similarity search
Proceedings of the VLDB Endowment
Efficient semantic-aware detection of near duplicate resources
ESWC'10 Proceedings of the 7th international conference on The Semantic Web: research and Applications - Volume Part II
On approximation algorithms for data mining applications
Efficient Approximation and Online Algorithms
WAN optimized replication of backup datasets using stream-informed delta compression
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
CRSI: a compact randomized similarity index for set-valued features
Proceedings of the 15th International Conference on Extending Database Technology
Survey: Urban pervasive applications: Challenges, scenarios and case studies
Computer Science Review
Compact hashing for mixed image-keyword query over multi-label images
Proceedings of the 2nd ACM International Conference on Multimedia Retrieval
A probabilistic model for multimodal hash function learning
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Foundations and Trends in Databases
Scalable mining of common routes in mobile communication network traffic data
Pervasive'12 Proceedings of the 10th international conference on Pervasive Computing
WAN-optimized replication of backup datasets using stream-informed delta compression
ACM Transactions on Storage (TOS)
KORE: keyphrase overlap relatedness for entity disambiguation
Proceedings of the 21st ACM international conference on Information and knowledge management
Being picky: processing top-k queries with set-defined selections
Proceedings of the 21st ACM international conference on Information and knowledge management
On the streaming complexity of computing local clustering coefficients
Proceedings of the sixth ACM international conference on Web search and data mining
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Comparing apples to oranges: a scalable solution with heterogeneous hashing
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Real-time recommendation of diverse related articles
Proceedings of the 22nd international conference on World Wide Web
Efficient community detection in large networks using content and links
Proceedings of the 22nd international conference on World Wide Web
NIFTY: a system for large scale information flow tracking and clustering
Proceedings of the 22nd international conference on World Wide Web
A distributed framework for scaling up LSH-based computations in privacy preserving record linkage
Proceedings of the 6th Balkan Conference in Informatics
Searching similar segments over textual event sequences
Proceedings of the 22nd ACM international conference on Information & knowledge management
Locality sensitive hashing revisited: filling the gap between theory and algorithm analysis
Proceedings of the 22nd ACM international conference on Information & knowledge management
b-bit minwise hashing in practice
Proceedings of the 5th Asia-Pacific Symposium on Internetware
Mixed image-keyword query adaptive hashing over multilabel images
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
Reduce and aggregate: similarity ranking in multi-categorical bipartite graphs
Proceedings of the 23rd international conference on World wide web
Towards large-scale geometry indexing by feature selection
Computer Vision and Image Understanding
EsPRESSO: Efficient privacy-preserving evaluation of sample set similarity
Journal of Computer Security