Rapidly making sense of, analyzing, and extracting useful information from large and complex data is a grand challenge. A user tasked with meeting this challenge is often befuddled by questions about where and how to begin understanding the relevant characteristics of such data. Real-world problem scenarios often involve scalability limitations and time constraints. In this paper we present an incremental, interactive data analysis system as a step toward addressing this challenge. The system builds on recent progress in interactive data exploration, locality sensitive hashing, knowledge caching, and graph visualization. Using visual cues based on rapid incremental estimates, the user is given a multi-level capability to probe and interrogate the intrinsic structure of the data. Throughout the interactive process, the output of previous probes can be used to construct increasingly tight coherence estimates across the parameter space, providing strong hints to the user about which analysis steps are promising to perform next. We present examples, interactive scenarios, and experimental results on several synthetic and real-world datasets that show the effectiveness and efficiency of our approach. The implications of this work are broad and can impact fields ranging from top-k algorithms to data clustering and from manifold learning to similarity search.
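To make the idea of "rapid incremental estimates" that tighten as more work is done more concrete, the following is a minimal sketch and not the system described above. It assumes min-hashing as the locality sensitive hashing scheme, Jaccard similarity as the quantity being estimated, and a Hoeffding bound for the interval; the abstract specifies none of these, and all function names are hypothetical.

```python
import math
import random

PRIME = (1 << 61) - 1  # Mersenne prime used as the hash modulus (assumption)


def make_hashes(n, seed=0):
    """Draw n random linear hash functions x -> (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(n)]


def signature(items, hashes):
    """Min-hash signature of a set of non-negative integers below PRIME."""
    return [min((a * x + b) % PRIME for x in items) for a, b in hashes]


def incremental_estimates(sig_a, sig_b, batch=32, delta=0.05):
    """Yield (hashes_used, estimate, lower, upper) after each batch.

    The estimate is the fraction of matching signature positions seen so far.
    The interval uses a Hoeffding half-width, so it narrows as more hash
    positions are compared.
    """
    n = min(len(sig_a), len(sig_b))
    matches = 0
    for i in range(1, n + 1):
        matches += sig_a[i - 1] == sig_b[i - 1]
        if i % batch == 0 or i == n:
            est = matches / i
            half = math.sqrt(math.log(2 / delta) / (2 * i))
            yield i, est, max(0.0, est - half), min(1.0, est + half)


if __name__ == "__main__":
    hashes = make_hashes(256)
    a = signature(set(range(0, 100)), hashes)
    b = signature(set(range(50, 150)), hashes)  # true Jaccard is 50/150
    for used, est, lo, hi in incremental_estimates(a, b):
        print(f"{used:4d} hashes: {est:.3f} in [{lo:.3f}, {hi:.3f}]")
```

In an interactive setting, a user could stop comparing a pair as soon as its interval falls entirely above or below a similarity threshold of interest, which is one way earlier probes might guide later ones.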