Event sequences capture system and user activity over time. Prior research on sequence mining has mostly focused on discovering local patterns that appear in a sequence. While interesting, these patterns do not give a comprehensive summary of the entire event sequence. Moreover, the number of patterns discovered can be large. In this article, we take an alternative approach and build short summaries that describe an entire sequence while also revealing local dependencies between event types. We formally define the summarization problem as an optimization problem that balances the shortness of the summary against the accuracy of the data description. We show that this problem can be solved optimally in polynomial time by combining two dynamic-programming algorithms. We also explore more efficient greedy alternatives and demonstrate that they work well on large datasets. Experiments on both synthetic and real datasets illustrate that our algorithms are efficient, produce high-quality results, and reveal interesting local structures in the data.
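To make the trade-off between summary length and description accuracy concrete, here is a minimal sketch of one dynamic program of this general kind. It is an illustration under stated assumptions, not the article's algorithm: the abstract does not give the cost function, so the sketch assumes a single event type recorded as a 0/1 sequence, a fixed per-segment `penalty` in bits (standing in for summary length), and a Bernoulli code length for the data in each segment (standing in for accuracy). The names `segment_cost` and `summarize` are hypothetical.

```python
import math


def segment_cost(ones, length, penalty):
    """Assumed cost of one segment, in bits.

    penalty: fixed charge for adding a segment to the summary.
    The remainder is the code length of the segment's 0/1 events
    under a Bernoulli model fitted to that segment.
    """
    p = ones / length
    data_bits = 0.0
    for count, prob in ((ones, p), (length - ones, 1.0 - p)):
        if count > 0:  # skip empty classes to avoid log2(0)
            data_bits -= count * math.log2(prob)
    return penalty + data_bits


def summarize(events, penalty=4.0):
    """Split a 0/1 sequence into contiguous segments of minimum total cost.

    Returns (total_cost, [(start, end), ...]) with half-open intervals.
    Runs in O(n^2) time using prefix sums.
    """
    n = len(events)
    prefix = [0]
    for e in events:
        prefix.append(prefix[-1] + e)

    best = [0.0] + [math.inf] * n  # best[j]: optimal cost of events[:j]
    back = [0] * (n + 1)           # back[j]: start of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + segment_cost(prefix[j] - prefix[i], j - i, penalty)
            if cost < best[j]:
                best[j], back[j] = cost, i

    # Walk the back-pointers to recover the segment boundaries.
    segments = []
    j = n
    while j > 0:
        segments.append((back[j], j))
        j = back[j]
    segments.reverse()
    return best[n], segments


if __name__ == "__main__":
    total, segments = summarize([0] * 10 + [1] * 10, penalty=4.0)
    print(total, segments)
```

Raising `penalty` in this sketch favors fewer, coarser segments, while lowering it lets the summary follow the data more closely. The sketch covers only segmentation along the time axis for a single event type.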