Outlier detection in relational data: A case study in geographical information systems

  • Authors:
  • Joris Maervoet; Celine Vens; Greet Vanden Berghe; Hendrik Blockeel; Patrick De Causmaecker

  • Affiliations:
  • KaHo Sint-Lieven, Computer Science, CODeS, Gebr. Desmetstraat 1, 9000 Ghent, Belgium and K.U.Leuven-Kulak, Department of Computer Science, CODeS, Etienne Sabbelaan 53, 8500 Kortrijk, Belgium; K.U.Leuven, Department of Computer Science, Celestijnenlaan 200 A, 3001 Leuven, Belgium; KaHo Sint-Lieven, Computer Science, CODeS, Gebr. Desmetstraat 1, 9000 Ghent, Belgium; K.U.Leuven, Department of Computer Science, Celestijnenlaan 200 A, 3001 Leuven, Belgium; K.U.Leuven-Kulak, Department of Computer Science, CODeS, Etienne Sabbelaan 53, 8500 Kortrijk, Belgium and ITEC - IBBT - K.U.Leuven

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2012

Quantified Score

Hi-index 12.05

Abstract

Geographical information systems are commonly used for a variety of purposes. Many of them rely on a large database of geographical data, the correctness of which strongly influences the reliability of the system. In this paper, we present an approach to quality maintenance that is based on the automatic discovery of non-perfect regularities in the data. The underlying idea is that exceptions to these regularities ('outliers') are probable errors in the data, to be investigated by a human expert. A case study shows how the tool can be used to extract valuable knowledge about outliers in real-world geographical data, in a manner that adapts to the evolving data model underlying it. While the tool is aimed specifically at geographical information systems, the underlying approach is more broadly applicable to quality maintenance in data-rich intelligent systems.
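
The idea of treating exceptions to almost-always-true regularities as probable errors can be illustrated with a minimal sketch. This is not the authors' tool or algorithm: it is a simplified single-table analogue (the paper works on relational data), and the function name `find_outliers`, the thresholds, and the road-segment attributes are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def find_outliers(records, min_confidence=0.9, min_support=5):
    """Flag records that violate a near-perfect rule 'A=a -> B=b'.

    Illustrative assumption: a "regularity" is a rule between two
    attributes of one table that holds for at least `min_confidence`
    of the matching records, but not for all of them.
    """
    outliers = []
    attributes = sorted(records[0].keys()) if records else []
    for cond in attributes:
        for target in attributes:
            if cond == target:
                continue
            # Count target values separately for each condition value.
            groups = defaultdict(Counter)
            for rec in records:
                groups[rec[cond]][rec[target]] += 1
            for cond_value, counts in groups.items():
                total = sum(counts.values())
                majority_value, majority_count = counts.most_common(1)[0]
                confidence = majority_count / total
                # Keep only well-supported rules that have exceptions.
                if total < min_support or not (min_confidence <= confidence < 1.0):
                    continue
                rule = f"{cond}={cond_value} -> {target}={majority_value}"
                for index, rec in enumerate(records):
                    if rec[cond] == cond_value and rec[target] != majority_value:
                        outliers.append((index, rule, confidence))
    return outliers


# Hypothetical road segments: one motorway carries a suspicious speed limit.
records = (
    [{"road_class": "motorway", "speed_limit": 120}] * 19
    + [{"road_class": "motorway", "speed_limit": 30}]
    + [{"road_class": "residential", "speed_limit": 30}] * 5
)
flagged = find_outliers(records)
for index, rule, confidence in flagged:
    print(f"record {index} violates '{rule}' (confidence {confidence:.2f})")
```

In this sketch the flagged records are only candidates: as in the approach described above, deciding whether an exception is a genuine error or a legitimate special case is left to a human expert.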