The sliding window model is useful for discounting stale data in data stream applications. In this model, data elements arrive continually and only the most recent N elements are used when answering queries. We present a novel technique for solving two important and related problems in the sliding window model: maintaining variance and maintaining a k-median clustering. Our solution to the problem of maintaining variance provides a continually updated estimate of the variance of the last N values in a data stream with relative error of at most ε using O((1/ε^2) log N) memory. We present a constant-factor approximation algorithm which maintains an approximate k-median solution for the last N data points using O((k/τ^4) N^(2τ) log^2 N) memory, where τ < 1/2 is a parameter which trades off the space bound with the approximation factor of O(2^(O(1/τ))).
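To make the query semantics concrete, the following is a minimal sketch of an exact baseline for variance over the last N elements. It is not the algorithm of the paper: it stores the whole window and therefore uses O(N) memory, which is exactly the cost the approximate method avoids. The class name `ExactWindowVariance`, its methods, and the choice of population variance (dividing by the number of elements currently in the window) are assumptions made for illustration.

```python
from collections import deque


class ExactWindowVariance:
    """Exact variance of the most recent n stream elements (O(n) memory).

    Hypothetical baseline for illustration only; the approximate
    method described in the abstract needs far less memory.
    """

    def __init__(self, n):
        if n <= 0:
            raise ValueError("window size must be positive")
        self.n = n
        self.window = deque()
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def add(self, x):
        if len(self.window) < self.n:
            # Window still filling up: standard incremental (Welford) update.
            self.window.append(x)
            delta = x - self.mean
            self.mean += delta / len(self.window)
            self.m2 += delta * (x - self.mean)
        else:
            # Window full: the oldest element expires as x arrives.
            old = self.window.popleft()
            self.window.append(x)
            old_mean = self.mean
            self.mean += (x - old) / self.n
            self.m2 += (x - old) * (x - self.mean + old - old_mean)

    def variance(self):
        """Population variance of the elements currently in the window."""
        if not self.window:
            return 0.0
        # Guard against tiny negative values caused by rounding.
        return max(self.m2, 0.0) / len(self.window)


if __name__ == "__main__":
    w = ExactWindowVariance(3)
    for value in [1, 2, 3, 4, 5]:
        w.add(value)
    print(w.variance())  # variance of the last three values: 3, 4, 5
```

An approximate scheme would be judged against this baseline: for a target ε, its estimate should stay within a relative error of ε of the value returned by `variance()`.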