Over the past decade, advances in computational and sensor technology have enabled us to dynamically collect vast amounts of data from observations, health screening tests, simulations, and experiments at an ever-increasing pace. Knowledge discovery and data mining is an iterative process concerned with deriving interesting, non-obvious, and useful patterns and models from such large volumes of data. Although inexpensive storage makes it feasible to retain this data, accessing and managing it for knowledge discovery and data mining becomes a performance issue when datasets are large, dynamic, and distributed. In this work, we present our vision of a software framework consisting of middleware services to support interactive data mining over dynamic data at data analysis centers built on top of heterogeneous clusters. We also present the design of a sampling service for dynamic data, together with initial performance results.