XML based framework for ETL processes for relational databases

  • Authors:
  • Tassawar Iqbal;Nadeem Daudpota

  • Affiliations:
  • Department of Computer Science, COMSATS Institute of Information Technology, Abbottabad, NWFP, Pakistan;Department of Computer Science, COMSATS Institute of Information Technology, Abbottabad, NWFP, Pakistan

  • Venue:
  • ACOS'06 Proceedings of the 5th WSEAS international conference on Applied computer science
  • Year:
  • 2006

Abstract

In data warehousing, Extraction-Transformation-Loading (ETL) tasks are responsible for extracting data from several sources, cleansing and customizing it, and inserting it into the data warehouse [10]. More specifically, ETL tools are a category of specialized tools that deal with data warehouse cleaning and loading problems. These tasks are critical in every data warehouse environment. ETL and data cleaning tools are estimated to account for at least one third of the effort and expense in a data warehouse budget [1,11]; other evidence indicates that the ETL process accounts for 55% of the total cost of the data warehouse [1,12]. In this paper, we focus on the problem of defining ETL processes using XML, in order to make the framework more generic and capable of dealing with heterogeneous source systems. We describe a framework that extracts data from various heterogeneous source systems and carries it in XML files; data cleaning is then performed using a few predefined XML templates and predefined functions, and the data is ultimately loaded into the data warehouse according to the warehouse schema.
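
The three stages outlined in the abstract (extract to XML, clean via predefined templates and functions, load per the warehouse schema) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the template format, the `CLEANERS` function registry, and the table and column names (`customers`, `dim_customer`, and so on) are all assumptions made up for illustration, and in-memory SQLite databases stand in for the source system and the warehouse.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical predefined cleaning functions, referenced by name in the template.
CLEANERS = {
    "trim": lambda s: s.strip(),
    "upper": lambda s: s.strip().upper(),
}

# Hypothetical cleaning template: maps each source column to a target
# warehouse column and names the cleaning function to apply.
TEMPLATE = """
<template target="dim_customer">
  <field source="name" target="customer_name" clean="trim"/>
  <field source="country" target="country_code" clean="upper"/>
</template>
"""

def extract_to_xml(conn, table):
    """Extract every row of `table` into a generic <rows> XML element."""
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    root = ET.Element("rows", {"source": table})
    for record in cur:
        row = ET.SubElement(root, "row")
        for col, value in zip(columns, record):
            ET.SubElement(row, col).text = "" if value is None else str(value)
    return root

def clean(rows_xml, template_xml):
    """Apply the template's cleaning functions; return (target table, rows)."""
    template = ET.fromstring(template_xml.strip())
    fields = template.findall("field")
    cleaned = []
    for row in rows_xml.findall("row"):
        cleaned.append({
            f.get("target"): CLEANERS[f.get("clean")](row.findtext(f.get("source"), ""))
            for f in fields
        })
    return template.get("target"), cleaned

def load(conn, table, rows):
    """Insert the cleaned rows into the warehouse table."""
    for r in rows:
        cols = ", ".join(r)
        marks = ", ".join("?" for _ in r)
        conn.execute(f"INSERT INTO {table} ({cols}) VALUES ({marks})", list(r.values()))
    conn.commit()

def run_demo():
    # In-memory stand-ins for a source system and the warehouse.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers (name TEXT, country TEXT)")
    source.executemany("INSERT INTO customers VALUES (?, ?)",
                       [("  Alice ", "pk"), ("Bob", " us ")])
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE dim_customer (customer_name TEXT, country_code TEXT)")

    target, rows = clean(extract_to_xml(source, "customers"), TEMPLATE)
    load(warehouse, target, rows)
    return warehouse.execute(
        "SELECT customer_name, country_code FROM dim_customer ORDER BY customer_name"
    ).fetchall()
```

Calling `run_demo()` pushes two untidy source rows through all three stages. Because the intermediate representation is plain XML, a different source system would, under these assumptions, only need its own extractor; the template and the loader would stay unchanged.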