Transforming data is a fundamental operation in application scenarios involving data integration, legacy data migration, data cleaning, and extract-transform-load processes. Data transformations are often implemented as relational queries that aim to leverage the optimization capabilities of most RDBMSs. However, relational query languages like SQL are not expressive enough to specify an important class of data transformations that produce several output tuples for a single input tuple. This class of data transformations is required for resolving the data heterogeneities that occur when source data represents an aggregation of target data. In this paper, we propose and formally define the data mapper operator as an extension of the relational algebra to address one-to-many data transformations. We supply an algebraic rewriting technique that enables the optimization of data transformation expressions that combine filters expressed as standard relational operators with mappers. Furthermore, we identify the two main factors that influence the expected optimization gains.
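To make the idea of a one-to-many transformation concrete, the following is a minimal Python sketch, not the paper's formal definition of the mapper operator. The relation `loans`, the function `split_installments`, and the helper `mapper` are hypothetical names chosen for illustration. The sketch assumes a source row that aggregates several target rows (a loan total that must become one row per installment) and shows one plausible instance of the filter/mapper rewriting: a selection on an attribute that the mapper function copies unchanged from its input can be evaluated before the mapper instead of after it.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]


def mapper(rows: Iterable[Row], fn: Callable[[Row], Iterable[Row]]) -> Iterator[Row]:
    """Apply a one-to-many function to every input tuple (illustrative only)."""
    for row in rows:
        # Each input tuple may contribute zero, one, or several output tuples.
        yield from fn(row)


def split_installments(loan: Row) -> Iterator[Row]:
    """Hypothetical mapper function: one aggregated loan row -> one row per installment."""
    n = loan["installments"]
    for seq in range(1, n + 1):
        yield {
            "account": loan["account"],   # copied unchanged from the input
            "seq": seq,
            "amount": loan["total"] / n,
        }


# Hypothetical source relation in which each row aggregates several target rows.
loans = [
    {"account": "A", "total": 300.0, "installments": 3},
    {"account": "B", "total": 100.0, "installments": 2},
]

# Plan 1: run the mapper on every source row, then filter its output.
filter_after = [r for r in mapper(loans, split_installments) if r["account"] == "A"]

# Plan 2: push the filter below the mapper, so fewer rows are mapped.
# This is only safe here because "account" is copied unchanged by the mapper function.
filter_before = list(
    mapper((l for l in loans if l["account"] == "A"), split_installments)
)

assert filter_after == filter_before
```

Both plans return the same three installment rows for account A, but the second one never invokes the mapper function on the loan for account B. How much such a rewriting saves in practice depends on the factors the paper analyzes, which this sketch does not attempt to model.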