Algorithms for clustering data
Algorithms for clustering data
Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
LOF: identifying density-based local outliers
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Multi-level organization and summarization of the discovered rules
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Data mining: concepts and techniques
Data mining: concepts and techniques
Distributed web log mining using maximal large item sets
Knowledge and Information Systems
Advances in Automatic Text Summarization
Advances in Automatic Text Summarization
Free-Sets: A Condensed Representation of Boolean Data for the Approximation of Frequency Queries
Data Mining and Knowledge Discovery
Mining All Non-derivable Frequent Itemsets
PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Learning nonstationary models of normal network traffic for detecting novel attacks
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining Top.K Frequent Closed Patterns without Minimum Support
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Approximating a collection of frequent sets
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining and summarizing customer reviews
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining condensed frequent-pattern bases
Knowledge and Information Systems
Introduction to Data Mining, (First Edition)
Introduction to Data Mining, (First Edition)
On efficiently summarizing categorical databases
Knowledge and Information Systems
Succinct summarization of transactional databases: an overlapped hyperrectangle scheme
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
A Bipartite Graph Framework for Summarizing High-Dimensional Binary, Categorical and Numeric Data
SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management
Anomaly extraction in backbone networks using association rules
Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference
Graph OLAP: a multi-dimensional framework for graph data analysis
Knowledge and Information Systems
Krimp: mining itemsets that compress
Data Mining and Knowledge Discovery
Summarizing transactional databases with overlapped hyperrectangles
Data Mining and Knowledge Discovery
Model order selection for boolean matrix factorization
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Data summarization model for user action log files
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part III
Anomaly extraction in backbone networks using association rules
IEEE/ACM Transactions on Networking (TON)
Data summarization for network traffic monitoring
Journal of Network and Computer Applications
Hi-index | 0.00 |
In this paper, we formulate the problem of summarization of a data set of transactions with categorical attributes as an optimization problem involving two objective functions – compaction gain and information loss. We propose metrics to characterize the output of any summarization algorithm. We investigate two approaches to address this problem. The first approach is an adaptation of clustering and the second approach makes use of frequent itemsets from the association analysis domain. We illustrate one application of summarization in the field of network data where we show how our technique can be effectively used to summarize network traffic into a compact but meaningful representation. Specifically, we evaluate our proposed algorithms on the 1998 DARPA Off-Line Intrusion Detection Evaluation data and network data generated by SKAION Corp for the ARDA information assurance program.