Nestedness and segmented nestedness

Authors:
Heikki Mannila; Evimaria Terzi
Affiliations:
University of Helsinki; IBM Almaden Research Center
Venue:
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2007

Abstract

Consider each row of a 0-1 dataset as the subset of the columns in which the row has a 1. A dataset is then nested if, for every pair of rows, one row is a subset or a superset of the other. The concept of nestedness has its origins in ecology, where approximate versions of it have been used to model the distribution of species across locations. We argue that nestedness and its extensions are interesting properties of datasets, and that they can also be applied to domains other than ecology. We first define natural measures of nestedness and study their properties. We then define the concept of k-nestedness: a dataset is (almost) k-nested if the set of columns can be partitioned into k parts so that each part is (almost) nested. We consider the algorithmic problems of computing how far a dataset is from being k-nested and of finding a good partition of the columns into k parts. The algorithms are based on spectral partitioning and scale to moderately large datasets. We apply the methods to real data from ecology and from other applications, and demonstrate the usefulness of the concept.
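
To make the two definitions in the abstract concrete, here is a minimal Python sketch that checks exact nestedness and exact k-nestedness for a given column partition. It is an illustration only: the function names (`row_sets`, `is_nested`, `is_k_nested`) and the example matrices are assumptions introduced here, not taken from the paper. It does not implement the paper's nestedness measures, the "almost" nested case, or the spectral partitioning algorithms.

```python
from typing import FrozenSet, List, Sequence


def row_sets(matrix: Sequence[Sequence[int]]) -> List[FrozenSet[int]]:
    """View each row as the set of column indices where it has a 1."""
    return [frozenset(j for j, v in enumerate(row) if v) for row in matrix]


def is_nested(matrix: Sequence[Sequence[int]]) -> bool:
    """True if every pair of rows is related by set inclusion.

    Sorting the row sets by size and checking consecutive pairs suffices:
    inclusion is transitive, so a chain of consecutive inclusions implies
    that all pairs are comparable.
    """
    sets = sorted(row_sets(matrix), key=len)
    return all(a <= b for a, b in zip(sets, sets[1:]))


def is_k_nested(matrix: Sequence[Sequence[int]],
                parts: Sequence[Sequence[int]]) -> bool:
    """True if the matrix restricted to each column part is nested.

    `parts` is a caller-supplied partition of the column indices
    (a hypothetical input; finding a good partition is the hard problem
    the paper addresses and is not attempted here).
    """
    return all(
        is_nested([[row[j] for j in part] for row in matrix])
        for part in parts
    )


# Hypothetical example data, chosen only to exercise the definitions.
nested = [[1, 1, 1],
          [1, 1, 0],
          [1, 0, 0]]
not_nested = [[1, 0],
              [0, 1]]
two_parts = [[1, 0, 1, 0],
             [1, 1, 0, 0],
             [0, 0, 1, 1]]

print(is_nested(nested))                           # True
print(is_nested(not_nested))                       # False
print(is_nested(two_parts))                        # False
print(is_k_nested(two_parts, [[0, 1], [2, 3]]))    # True
```

The last example is not nested as a whole, yet each half of its columns is nested on its own, which is the situation k-nestedness (here with k = 2) is meant to capture.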