A general framework for subspace detection in unordered multidimensional data

  • Authors: Leandro A. F. Fernandes; Manuel M. Oliveira

  • Affiliations: Instituto de Computação, Universidade Federal Fluminense (UFF), CEP 24210-240 Niterói, RJ, Brazil; Instituto de Informática, Universidade Federal do Rio Grande do Sul (UFRGS), CP 15064 CEP 91501-970, Porto Alegre, RS, Brazil

  • Venue: Pattern Recognition
  • Year: 2012

Abstract

The analysis of large volumes of unordered multidimensional data is a problem confronted by scientists and data analysts every day. Often, it involves searching for data alignments that emerge as well-defined structures or geometric patterns in datasets. For example, straight lines, circles, and ellipses represent meaningful structures in data collected from electron backscatter diffraction, particle accelerators, and clonogenic assays. Likewise, customers with similar behavior describe linear correlations in e-commerce databases. We describe a general approach for detecting data alignments in large, unordered, noisy multidimensional datasets. In contrast to classical techniques such as Hough transforms, which are designed for detecting a specific type of alignment in a given type of input, our approach is independent of both the geometric properties of the alignments to be detected and the type of input data. Thus, it allows concurrent detection of multiple kinds of data alignments in datasets containing multiple types of data. Given its general nature, optimizations developed for our technique immediately benefit all its applications, regardless of the type of input data.
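
For context, the classical technique the abstract contrasts against can be sketched as follows. This is a minimal illustration of the standard Hough transform for straight lines in 2D point sets, not the authors' general framework. The function names (`hough_lines`, `best_line`) and the parameters (`n_theta`, `rho_step`) are assumptions introduced for this example. Each point votes for every discretized line `x*cos(theta) + y*sin(theta) = rho` passing through it, and peaks in the accumulator indicate alignments. The parameterization is tied to one alignment type (lines) and one input type (points), which is the limitation the abstract describes.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180, rho_step=1.0):
    """Accumulate votes in a discretized (theta, rho) parameter space.

    Each (x, y) point votes for every line x*cos(theta) + y*sin(theta) = rho
    that passes through it, with theta sampled in [0, pi).
    """
    votes = Counter()
    for x, y in points:
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Quantize rho to the nearest accumulator bin.
            votes[(i, round(rho / rho_step))] += 1
    return votes


def best_line(points, n_theta=180, rho_step=1.0):
    """Return (theta, rho, vote_count) for one most-voted accumulator bin."""
    votes = hough_lines(points, n_theta, rho_step)
    (i, k), count = max(votes.items(), key=lambda kv: kv[1])
    return math.pi * i / n_theta, k * rho_step, count


# Hypothetical data: ten points on the horizontal line y = 5, plus two outliers.
sample = [(float(x), 5.0) for x in range(10)] + [(3.0, 0.0), (7.0, 9.0)]
theta, rho, count = best_line(sample)
print(theta, rho, count)
```

Because the accumulator is discretized, several neighboring bins can receive the same number of votes, so the reported `theta` is close to, but not necessarily exactly, `pi/2` for this sample. Detecting circles or ellipses with this approach would require a different parameterization and accumulator, which is the per-shape design effort a general framework aims to avoid.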