PLASMA-HD: probing the lattice structure and makeup of high-dimensional data

Authors:
David Fuhry;Yang Zhang;Venu Satuluri;Arnab Nandi;Srinivasan Parthasarathy
Affiliations:
Department of Computer Science and Engineering, The Ohio State University, Columbus, OH;Department of Computer Science and Engineering, The Ohio State University, Columbus, OH;Twitter and Department of Computer Science and Engineering, The Ohio State University, Columbus, OH;Department of Computer Science and Engineering, The Ohio State University, Columbus, OH;Department of Computer Science and Engineering, The Ohio State University, Columbus, OH
Venue:
Proceedings of the VLDB Endowment
Year:
2013

Abstract

Rapidly making sense of, analyzing, and extracting useful information from large and complex data is a grand challenge. A user tasked with meeting this challenge is often befuddled by questions of where and how to begin understanding the relevant characteristics of such data. Real-world problem scenarios often involve scalability limitations and time constraints. In this paper we present an incremental, interactive data analysis system as a step toward addressing this challenge. The system builds on recent progress in interactive data exploration, locality-sensitive hashing, knowledge caching, and graph visualization. Using visual cues based on rapid incremental estimates, the user is given a multi-level capability to probe and interrogate the intrinsic structure of the data. Throughout the interactive process, the output of previous probes can be used to construct increasingly tight coherence estimates across the parameter space, giving the user strong hints about which analysis steps are promising to perform next. We present examples, interactive scenarios, and experimental results on several synthetic and real-world datasets that show the effectiveness and efficiency of our approach. The implications of this work are broad and can impact fields ranging from top-k algorithms to data clustering and from manifold learning to similarity search.
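
To make the idea of "rapid incremental estimates" concrete, the following is a minimal sketch of how a similarity estimate can be refined as more hash functions are evaluated. It is not the PLASMA-HD implementation: it assumes set-valued records, Jaccard similarity, and plain MinHash as the locality-sensitive hashing scheme, and all function names and parameters (`minhash`, `incremental_jaccard`, `num_hashes`, `batch`) are hypothetical choices made for illustration.

```python
import hashlib


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of an item under a given seed (illustrative choice)."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash(items: set, seed: int) -> int:
    """MinHash value of a non-empty set under the hash function selected by `seed`."""
    return min(_hash(seed, x) for x in items)


def incremental_jaccard(a: set, b: set, num_hashes: int = 256, batch: int = 32) -> list:
    """Return running Jaccard estimates, one after every `batch` hash functions.

    Each hash function agrees on the two sets with probability equal to their
    Jaccard similarity, so the fraction of agreements seen so far is an
    estimate that tends to tighten as more hashes are evaluated.
    """
    matches = 0
    estimates = []
    for seed in range(num_hashes):
        if minhash(a, seed) == minhash(b, seed):
            matches += 1
        if (seed + 1) % batch == 0:
            estimates.append(matches / (seed + 1))
    return estimates


def exact_jaccard(a: set, b: set) -> float:
    """Exact Jaccard similarity, used here only to check the estimate."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    a = {str(i) for i in range(0, 100)}
    b = {str(i) for i in range(50, 150)}
    print("exact:", exact_jaccard(a, b))
    for k, est in enumerate(incremental_jaccard(a, b), start=1):
        print(f"after {k * 32} hashes: {est:.3f}")
```

In an interactive setting, an early and coarse estimate of this kind could be shown to the user immediately and refined in the background; how the actual system schedules, caches, and visualizes its estimates is not specified in this abstract.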