Fragments of order

Authors:
Aristides Gionis;Teija Kujala;Heikki Mannila
Affiliations:
Stanford University, Stanford, CA;University of Helsinki, P.O. Box 26, Teollisuuskatu 23, Helsinki, Finland;University of Helsinki, P.O. Box 26, Teollisuuskatu 23, Helsinki, Finland
Venue:
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2003

Abstract

High-dimensional collections of 0-1 data occur in many applications. The attributes in such data sets are typically considered to be unordered. However, in many cases there is a natural total or partial order ≺ underlying the variables of the data set. Examples of variables for which such orders exist include terms in documents, courses in enrollment data, and paleontological sites in fossil data collections. The observations in such applications are flat, unordered sets; however, the data sets respect the underlying ordering of the variables. By this we mean that if A ≺ B ≺ C are three variables respecting the underlying ordering ≺, and both variables A and C appear in an observation, then, up to noise levels, variable B also appears in this observation. Similarly, if A_1 ≺ A_2 ≺ … ≺ A_{l-1} ≺ A_l is a longer sequence of variables, we do not expect to see many observations for which there are indices i < j < k such that A_i and A_k occur in the observation but A_j does not.

In this paper we study the problem of discovering fragments of orders of variables implicit in collections of unordered observations. We define measures that capture how well a given order agrees with the observed data. We describe a simple and efficient algorithm for finding all the fragments that satisfy certain conditions. We also discuss the postprocessing that is sometimes necessary for selecting only the best fragments of order. In addition, we relate our method to a sequencing approach that uses a spectral algorithm, and to the consecutive ones problem. We present experimental results on some real data sets (author lists of database papers, exam results data, and paleontological data).
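The agreement condition stated in the abstract (an observation should not contain A_i and A_k while missing an intermediate A_j) can be illustrated with a small sketch. This is not the measure or the algorithm defined in the paper, which the abstract does not spell out. It is only a minimal illustration, assuming observations are Python sets of variable names and a candidate fragment is a list of variables in order. The function names `has_gap` and `violation_fraction` are hypothetical.

```python
from typing import Iterable, Sequence, Set


def has_gap(observation: Set[str], order: Sequence[str]) -> bool:
    """Return True if the observation skips a variable lying between
    two variables of the candidate order that it does contain."""
    # Positions (within the candidate order) of the variables that occur.
    present = [pos for pos, var in enumerate(order) if var in observation]
    if len(present) < 2:
        return False  # zero or one variable present: no gap is possible
    # Without gaps, the present positions form one contiguous run.
    return present[-1] - present[0] + 1 != len(present)


def violation_fraction(observations: Iterable[Set[str]],
                       order: Sequence[str]) -> float:
    """Fraction of observations that contradict the candidate order
    (an illustrative score, not the paper's measure)."""
    obs = list(observations)
    if not obs:
        return 0.0
    return sum(has_gap(o, order) for o in obs) / len(obs)


order = ["A", "B", "C"]
data = [{"A", "B"}, {"B", "C"}, {"A", "B", "C"}, {"A", "C"}]
print(violation_fraction(data, order))  # 0.25: only {"A", "C"} skips B
```

Variables in an observation that do not belong to the candidate fragment are ignored, so the same check applies to a fragment covering only a few of the variables in the data.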