An Enhanced Extract-Transform-Load System for Migrating Data in Telecom Billing

  • Authors:
  • Himanshu Agrawal; Girish Chafle; Sunil Goyal; Sumit Mittal; Sougata Mukherjea

  • Affiliations:
  • IBM India Research Laboratory, New Delhi, India. hiagrawa@in.ibm.com; IBM India Research Laboratory, New Delhi, India. cgirish@in.ibm.com; IBM India Research Laboratory, New Delhi, India. gsunil@in.ibm.com; IBM India Research Laboratory, New Delhi, India. sumittal@in.ibm.com; IBM India Research Laboratory, New Delhi, India. smukherj@in.ibm.com

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

Data migration has become a priority in many industries, driven by a variety of business needs. Most existing tools for the Extract, Transform and Load (ETL) process of data migration are piecemeal and do not present a complete solution. Moreover, while research has focused on the problem of schema mapping, a key step in the ETL process, most current algorithms do not perform well on real-world data. Researchers have suggested using domain knowledge to enhance schema mapping. In this paper, we use domain knowledge in an innovative manner to improve schema mapping in an actual industrial setting. Further, we take a comprehensive view of the data migration problem and present an end-to-end system for the ETL process, utilizing existing tools for each step and building connectors wherever required. We focus on data migration for telecom billing and utilize domain knowledge captured in an ontology, a thesaurus and a set of rules to improve schema mapping. Experiments conducted on real-life data demonstrate the effectiveness of our system and validate the utility of domain knowledge in data migration projects.
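
To illustrate how a thesaurus can help schema mapping, here is a minimal Python sketch. It is not the authors' system: the abstract does not give the algorithm, and the ontology and rule components it mentions are left out. The thesaurus entries, column names, function names and the 0.8 threshold are all hypothetical, chosen only for illustration. Each column name is split into tokens, known domain synonyms are replaced with a canonical term, and the normalized names are then compared with a generic string-similarity measure.

```python
from difflib import SequenceMatcher

# Hypothetical telecom-billing thesaurus: synonym or abbreviation -> canonical term.
THESAURUS = {
    "cust": "customer",
    "subscriber": "customer",
    "acct": "account",
    "msisdn": "phone_number",
    "tel": "phone_number",
    "amt": "amount",
}


def normalize(name):
    """Lowercase a column name, split it on '_' or '-', and canonicalize each token."""
    tokens = name.lower().replace("-", "_").split("_")
    return "_".join(THESAURUS.get(token, token) for token in tokens if token)


def similarity(source, target):
    """String similarity in [0, 1] between the two normalized column names."""
    return SequenceMatcher(None, normalize(source), normalize(target)).ratio()


def map_schema(source_cols, target_cols, threshold=0.8):
    """Map each source column to its best-scoring target, or None below the threshold."""
    mapping = {}
    for source in source_cols:
        best = max(target_cols, key=lambda target: similarity(source, target))
        mapping[source] = best if similarity(source, best) >= threshold else None
    return mapping


# Hypothetical legacy and target billing schemas.
legacy = ["CUST_ID", "MSISDN", "BILL_AMT", "XYZ"]
target = ["customer_id", "phone_number", "bill_amount", "region"]
mapping = map_schema(legacy, target)
```

With these inputs, the first three legacy columns match their targets exactly after normalization, while `XYZ` has no counterpart and maps to `None`. Without the thesaurus, a purely lexical comparison of `MSISDN` and `phone_number` would score low, which is the kind of gap domain knowledge is meant to close.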