Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known Apriori algorithm is effective only when the rules of interest are relationships that occur very frequently. However, there are a number of applications, such as data mining, identification of similar web documents, clustering, and collaborative filtering, where the rules of interest have comparatively few instances in the data. In these cases, we must look for highly correlated items, or possibly even causal relationships between infrequent items. We develop a family of algorithms for solving this problem, employing a combination of random sampling and hashing techniques. We analyze these algorithms and compare their performance in experiments on real and synthetic data.
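The abstract does not say which sampling or hashing scheme the algorithms use. As one illustration of how hashing can surface highly correlated items without a support threshold, the sketch below uses min-hash signatures to estimate the Jaccard similarity of two items from the sets of transactions that contain them. The function names, the choice of BLAKE2 as the hash, and the toy data are all assumptions made for this example, not details taken from the paper.

```python
import hashlib


def _hash(index: int, row: int) -> int:
    """Hypothetical helper: the index-th hash function applied to a row id."""
    digest = hashlib.blake2b(f"{index}:{row}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(rows: set, num_hashes: int) -> list:
    """Min-hash signature of an item, given the (non-empty) set of
    transaction ids in which the item occurs."""
    return [min(_hash(i, r) for r in rows) for i in range(num_hashes)]


def estimated_similarity(sig_a: list, sig_b: list) -> float:
    """Fraction of signature positions that agree; this estimates the
    Jaccard similarity of the two underlying row sets."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard similarity, used here only to check the estimate."""
    return len(a & b) / len(a | b)


# Toy data (assumed): two items that each occur in only 60 transactions,
# which is low support in a large database, yet overlap in 40 of them.
item_a = set(range(0, 60))
item_b = set(range(20, 80))

sig_a = minhash_signature(item_a, num_hashes=400)
sig_b = minhash_signature(item_b, num_hashes=400)

print("exact:", jaccard(item_a, item_b))
print("estimate:", estimated_similarity(sig_a, sig_b))
```

The signatures have a fixed length regardless of how many transactions contain an item, so candidate pairs can be compared cheaply, and the estimate depends only on the overlap between the two items, not on how frequent either one is.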