Established companies have had decades to accumulate masses of data about their customers, suppliers, products and services, and employees. Data mining, also known as knowledge discovery in databases, gives organizations the tools to sift through these vast data stores to find the trends, patterns, and correlations that can guide strategic decision making. Traditionally, algorithms for data analysis assume that the input data contains relatively few records. Current databases, however, are much too large to be held in main memory. To be efficient, the data-mining techniques applied to very large databases must be highly scalable. An algorithm is said to be scalable if, given a fixed amount of main memory, its runtime increases linearly with the number of records in the input database. Recent work has focused on scaling data-mining algorithms to very large data sets. In this survey, the authors describe a broad range of algorithms that address three classical data-mining problems: market basket analysis, clustering, and classification.
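As a rough illustration of the scalability definition above, the following sketch counts single-item supports for market basket analysis in one pass over the records. It is not an algorithm from the survey: the function names `read_baskets` and `frequent_items`, the whitespace-separated input format, and the `min_support` threshold are all assumptions made for this example. Records are consumed one at a time, so memory depends on the number of distinct items rather than on the number of records, and the runtime grows linearly with the number of records as long as basket sizes are bounded.

```python
from collections import Counter


def read_baskets(lines):
    """Yield one basket (a set of items) per non-empty line.

    Assumed input format: each record is a line of whitespace-separated items.
    Because this is a generator, the full database is never held in memory.
    """
    for line in lines:
        items = frozenset(line.split())
        if items:
            yield items


def frequent_items(lines, min_support):
    """Return {item: count} for items appearing in at least min_support baskets.

    Single pass over the records; memory is proportional to the number of
    distinct items, not to the number of records.
    """
    counts = Counter()
    for basket in read_baskets(lines):
        counts.update(basket)  # each item counted at most once per basket
    return {item: n for item, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    # Hypothetical toy data; in practice `lines` could be an open file handle.
    sample = ["bread milk", "bread butter", "milk bread", ""]
    print(frequent_items(sample, min_support=2))
```

Passing an open file object instead of a list keeps the same behavior while reading the database sequentially from disk. Counting larger itemsets, clustering, or classification at scale requires the more involved techniques that the survey covers.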