Aggregating time partitions

Authors:
Taneli Mielikäinen;Evimaria Terzi;Panayiotis Tsaparas
Affiliations:
Helsinki Institute for Information Technology, University of Helsinki, Finland;Helsinki Institute for Information Technology, University of Helsinki, Finland;Helsinki Institute for Information Technology, University of Helsinki, Finland
Venue:
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2006

Citing 25

Bagging predictors
Machine Learning

Temporal sequence learning and data reduction for anomaly detection
ACM Transactions on Information and System Security (TISSEC)

The Earth Mover's Distance as a Metric for Image Retrieval
International Journal of Computer Vision

On the approximation of curves by line segments using dynamic programming
Communications of the ACM

DNA segmentation as a model selection process
RECOMB '01 Proceedings of the fifth annual international conference on Computational biology

Rank aggregation methods for the Web
Proceedings of the 10th international conference on World Wide Web

Data-streams and histograms
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing

A framework for constructing features and models for intrusion detection systems
ACM Transactions on Information and System Security (TISSEC)

Discovery of Frequent Episodes in Event Sequences
Data Mining and Knowledge Discovery

Enhancing profiles for anomaly detection using time granularities
Journal of Computer Security

Logistic Regression, AdaBoost and Bregman Distances
Machine Learning

Finding recurrent sources in sequences
RECOMB '03 Proceedings of the seventh annual international conference on Research in computational molecular biology

Mining Sequential Patterns
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering

An Online Algorithm for Segmenting Time Series
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining

Time Series Segmentation for Context Recognition in Mobile Devices
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining

Bursty and Hierarchical Structure in Streams
Data Mining and Knowledge Discovery

Cluster ensembles --- a knowledge reuse framework for combining multiple partitions
The Journal of Machine Learning Research

An Overview of Haplotyping via Perfect Phylogeny: Theory, Algorithms and Programs
ICTAI '03 Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence

An efficient boosting algorithm for combining preferences
The Journal of Machine Learning Research

Clustering Aggregation
ICDE '05 Proceedings of the 21st International Conference on Data Engineering

Comparing and aggregating rankings with ties
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Combining Multiple Clusterings Using Evidence Accumulation
IEEE Transactions on Pattern Analysis and Machine Intelligence

Reliable detection of episodes in event sequences
Knowledge and Information Systems

Aggregating inconsistent information: ranking and clustering
Proceedings of the thirty-seventh annual ACM symposium on Theory of computing

Combining partitions by probabilistic label aggregation
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining

Cited 6

Clustering aggregation
ACM Transactions on Knowledge Discovery from Data (TKDD)

A fuzzy-driven genetic algorithm for sequence segmentation applied to genomic sequences
Applied Soft Computing

Topic dynamics: an alternative model of bursts in streams of topics
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining

A new class of attacks on time series data mining
Intelligent Data Analysis

Evaluation of BIC and Cross Validation for model selection on sequence segmentations
International Journal of Data Mining and Bioinformatics

Preserving Privacy in Time Series Data Mining
International Journal of Data Warehousing and Mining

Abstract

Partitions of sequential data arise either naturally or as the output of sequence segmentation algorithms. The same timeline is often partitioned in many different ways; for example, different segmentation algorithms produce different partitions of the same underlying data points. In such cases, we are interested in producing an aggregate partition, i.e., a segmentation that agrees as much as possible with the input segmentations, where each partition is a set of contiguous, non-overlapping segments of the timeline. We show that this problem can be solved optimally in polynomial time using dynamic programming. We also propose faster greedy heuristics that work well in practice. We evaluate our algorithms experimentally and demonstrate their utility in clustering the behavior of mobile-phone users and in combining the results of different segmentation algorithms on genomic sequences.
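
The abstract does not spell out how agreement between segmentations is measured, so the following is only a minimal sketch under an assumed objective: the disagreement distance, i.e., the number of point pairs that one segmentation places in the same segment and the other places in different segments. Under that assumption the total cost decomposes over the segments of the aggregate, which allows a simple O(n²m) dynamic program for n points and m input segmentations. The function names (`segment_starts`, `disagreement`, `aggregate`) and the representation of a segmentation as a sorted list of cut positions are hypothetical choices for illustration, not taken from the paper.

```python
from bisect import bisect_right


def segment_starts(cuts, n):
    """For each point 0..n-1, return the start index of its segment.

    `cuts` is a sorted list of positions c (0 < c < n) at which a new
    segment begins (an assumed representation).
    """
    bounds = [0] + list(cuts)
    return [bounds[bisect_right(bounds, i) - 1] for i in range(n)]


def disagreement(cuts_p, cuts_q, n):
    """Count point pairs that are together in one segmentation but not the other."""
    sp, sq = segment_starts(cuts_p, n), segment_starts(cuts_q, n)
    return sum(
        (sp[i] == sp[j]) != (sq[i] == sq[j])
        for i in range(n)
        for j in range(i + 1, n)
    )


def aggregate(inputs, n):
    """Return cuts of a segmentation minimizing total disagreement with `inputs`.

    A pair kept together by the aggregate costs (#inputs separating it);
    a pair split costs (#inputs keeping it together). Relative to the
    all-split baseline, each together-pair adds 2 * separated - m, so the
    objective is a sum of per-segment weights.
    """
    m = len(inputs)
    starts = [segment_starts(c, n) for c in inputs]
    best = [float("inf")] * (n + 1)  # best[b]: min relative cost for points 0..b-1
    back = [0] * (n + 1)             # back[b]: start of the last segment ending at b
    best[0] = 0
    for a in range(n):
        w = 0  # relative cost of the candidate segment [a, b]
        for b in range(a, n):
            # Adding point b creates pairs (i, b) for i in a..b-1. Input s
            # separates i from b exactly when i lies before b's segment start.
            separated = sum(max(0, s[b] - a) for s in starts)
            w += 2 * separated - m * (b - a)
            if best[a] + w < best[b + 1]:
                best[b + 1] = best[a] + w
                back[b + 1] = a
    cuts, b = [], n
    while b > 0:  # walk the back-pointers to recover the segment boundaries
        if back[b] > 0:
            cuts.append(back[b])
        b = back[b]
    return sorted(cuts)


if __name__ == "__main__":
    segmentations = [[2, 5], [3, 5], [2]]  # three partitions of a 7-point timeline
    agg = aggregate(segmentations, 7)
    print(agg, sum(disagreement(agg, q, 7) for q in segmentations))
```

This sketch considers every position as a candidate boundary, which keeps it short at the expense of speed; a greedy variant of the kind the abstract mentions would trade optimality for running time.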