We present algorithms for simplifying and clustering patterns from sensors such as GPS, LiDAR, and other devices that can produce high-dimensional signals. The algorithms are suitable for handling very large (e.g., terabyte-scale) streaming data and can be run in parallel on networks or clouds. Applications include compression, denoising, activity recognition, road matching, and map generation. We encode these problems as (k, m)-segment mean problems. Formally, we provide (1 + ε)-approximations to the k-segment and (k, m)-segment mean of a d-dimensional discrete-time signal. The k-segment mean is a k-piecewise linear function that minimizes the regression distance to the signal. The (k, m)-segment mean has an additional constraint that the projection of the k segments on R^d consists of only m ≤ k segments. Existing algorithms for these problems take O(kn^2) and n^O(mk) time respectively and O(kn^2) space, where n is the length of the signal. Our main tool is a new coreset for discrete-time signals. The coreset is a smart compression of the input signal that allows computation of a (1 + ε)-approximation to the k-segment or (k, m)-segment mean in O(n log n) time for arbitrary constants ε, k, and m. We use coresets to obtain a parallel algorithm that scans the signal in one pass, using space and update time per point that is polynomial in log n. We provide empirical evaluations of the quality of our coreset and experimental results that show how our coreset boosts both inefficient optimal algorithms and existing heuristics. We demonstrate our results for extracting signals from GPS traces. However, the results are more general and applicable to other types of sensors.
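To make the k-segment mean concrete, the following is a minimal Python sketch of the kind of exact dynamic program whose quadratic running time the abstract refers to. It is not the coreset construction and not necessarily the authors' formulation. It assumes that the regression distance is the sum of squared residuals of an ordinary least-squares line fitted to each block of consecutive samples, that the k pieces are fitted independently (they need not join at breakpoints), and that k is at most the signal length. The function name k_segment_mean and its return format are hypothetical.

```python
import numpy as np


def k_segment_mean(t, p, k):
    """Exact k-segment fit by dynamic programming (illustrative sketch).

    t: n sample times; p: n samples, each a scalar or a d-vector.
    Returns (total squared error, start index of each of the k segments).
    Assumes k <= n and independent least-squares lines per segment.
    """
    t = np.asarray(t, dtype=float)
    p = np.asarray(p, dtype=float).reshape(len(t), -1)
    n = len(t)

    def prefix(a):
        # Prefix sums with a leading zero row, so block sums are differences.
        zero = np.zeros((1,) + a.shape[1:])
        return np.concatenate([zero, np.cumsum(a, axis=0)])

    St, Stt = prefix(t), prefix(t * t)
    Sp, Spp, Stp = prefix(p), prefix(p * p), prefix(t[:, None] * p)

    def cost(i, j):
        # Residual sum of squares of the best line through samples i..j-1.
        m = j - i
        st = St[j] - St[i]
        sp = Sp[j] - Sp[i]
        var_t = (Stt[j] - Stt[i]) - st * st / m
        var_p = (Spp[j] - Spp[i]) - sp * sp / m
        if var_t <= 1e-12:  # single sample (or identical times)
            return max(float(var_p.sum()), 0.0)
        cov = (Stp[j] - Stp[i]) - st * sp / m
        return max(float((var_p - cov * cov / var_t).sum()), 0.0)

    # best[s, j]: minimal error covering the first j samples with s segments.
    best = np.full((k + 1, n + 1), np.inf)
    back = np.zeros((k + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                c = best[s - 1, i] + cost(i, j)
                if c < best[s, j]:
                    best[s, j], back[s, j] = c, i

    # Walk the back pointers to recover where each segment starts.
    starts, j = [], n
    for s in range(k, 0, -1):
        j = back[s, j]
        starts.append(int(j))
    return float(best[k, n]), starts[::-1]


# Usage: a signal made of two exact linear pieces is recovered with zero error.
times = np.arange(10)
values = np.where(times < 5, times, 20 - 2 * times)
error, starts = k_segment_mean(times, values, k=2)
```

The three nested loops evaluate O(kn^2) block costs, each in O(d) time thanks to the prefix sums, which is why a direct approach of this kind becomes impractical for long signals and why a small coreset that preserves these costs up to a (1 + ε) factor is attractive.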