Mining census data for spatial effects on mortality

  • Authors:
  • Willi Klösgen;Michael May;Jim Petch

  • Affiliations:
  • Fraunhofer Institute for Autonomous Intelligent Systems, Knowledge Discovery Team, D-53757 Sankt Augustin, Germany. Tel.: +49 2241 142723/ Fax: +49 2241 142072/ E-mail: {kloesgen, may}@ais.fhg.de;Fraunhofer Institute for Autonomous Intelligent Systems, Knowledge Discovery Team, D-53757 Sankt Augustin, Germany. Tel.: +49 2241 142723/ Fax: +49 2241 142072/ E-mail: {kloesgen, may}@ais.fhg.de;University of Manchester, Distributed Learning, Oxford Road, Manchester M13 9PL, UK. E-mail: Jim.Petch@man.ac.uk

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2003

Abstract

This paper describes a system for spatial data mining and illustrates its features through an application to spatial census data. Using census data for data mining poses specific challenges. Because of data privacy regulations, census data are generally available for analysis only in aggregated form. Primary data (the responses of individual persons) are aggregated into many cross tabulations for small geographical units. Thus the target objects of secondary analysis are small areas (enumeration districts or wards). Any cell or marginal of a cross tabulation can be used as a variable on these target objects. The target objects can be linked with other spatial objects (e.g. rivers, roads, railway lines) for spatial analyses. In this paper we discuss the special problems that arise in this type of aggregate data mining, including spatial analyses. We show an application of SubgroupMiner, an advanced subgroup mining system that supports multirelational hypotheses, efficient database integration, discovery of causal subgroup structures, and visualization-based interaction options. The application explores whether transportation lines (e.g. roads, railway lines) increase mortality among persons who live near them, possibly because of a higher occurrence of some disease.
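To make the idea of subgroup mining on aggregated areas concrete, the following is a minimal sketch and not SubgroupMiner's actual interface. It assumes each ward already carries a hypothetical `near_road` flag (as a spatial join with a road layer might produce) and scores a subgroup with one common quality function, sqrt(n) * (p - p0), where n is the number of wards in the subgroup, p its pooled mortality rate, and p0 the pooled rate over all wards. The ward names and figures are invented for illustration only, and the quality function is an assumption, not necessarily the one used in the paper.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List


@dataclass
class Ward:
    """One aggregated target object (hypothetical fields)."""
    name: str
    population: int
    deaths: int
    near_road: bool  # assumed result of a spatial join with a road layer


def mortality_rate(wards: List[Ward]) -> float:
    """Pooled mortality rate: total deaths divided by total population."""
    population = sum(w.population for w in wards)
    if population == 0:
        return 0.0
    return sum(w.deaths for w in wards) / population


def subgroup_quality(wards: List[Ward],
                     predicate: Callable[[Ward], bool]) -> float:
    """Score a subgroup as sqrt(n) * (p - p0).

    n  = number of wards satisfying the predicate
    p  = pooled mortality rate inside the subgroup
    p0 = pooled mortality rate over all wards
    """
    members = [w for w in wards if predicate(w)]
    if not members:
        return 0.0
    p = mortality_rate(members)
    p0 = mortality_rate(wards)
    return sqrt(len(members)) * (p - p0)


# Invented illustration data, not taken from any census.
wards = [
    Ward("A", 1000, 12, True),
    Ward("B", 2000, 26, True),
    Ward("C", 1500, 12, False),
    Ward("D", 2500, 20, False),
]

near = subgroup_quality(wards, lambda w: w.near_road)
far = subgroup_quality(wards, lambda w: not w.near_road)
print(f"near road: {near:.6f}, not near road: {far:.6f}")
```

A positive score means the subgroup's mortality rate lies above the overall rate. Because the inputs are area aggregates, such a deviation describes wards rather than individuals, which is one of the special problems of aggregate data mining the abstract refers to.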