Significance information for translation: air quality data integration

  • Authors:
  • Andrew Philpot; Patrick Pantel; Eduard Hovy

  • Affiliations:
  • USC Information Sciences Institute, Marina del Rey, CA; USC Information Sciences Institute, Marina del Rey, CA; USC Information Sciences Institute, Marina del Rey, CA

  • Venue:
  • dg.o '05 Proceedings of the 2005 national conference on Digital government research
  • Year:
  • 2005

Abstract

The management of air quality involves local, state, regional, national, and international organizations. At each level, data are collected and used for analysis, assessment, and regulatory enforcement. Effective air quality management requires coordination among multiple organizations and therefore integration of their respective data sets. This integration remains a complex IT challenge because each organization uses its own methods for collecting, storing, formatting, and disseminating data. Today, when organizations need to share data, they usually rely on specialized arrangements and significant manual effort to create usable mappings between the data sources. More general methods are required to bridge the gap between representations and organization schemes. We present an interactive web-based demo of our preliminary work adapting the statistical alignment and clustering methods from cross-language statistical machine translation. Using the demo, users can discover new relations and test likely candidate relations between two similar data sources from local and California state air quality management agencies.