The last decade witnessed the emergence of various distributed storage and computation systems for cloud-scale data processing. Scope is a distributed computation platform targeted at a variety of data analysis and data mining applications, powering Bing and other online services at Microsoft. Scope combines the benefits of traditional parallel databases and MapReduce execution engines to allow easy programmability. It features a SQL-like declarative scripting language with .NET extensions, and delivers massive scalability and high performance through advanced optimization. Scope currently operates over tens of thousands of machines and processes over a million jobs per month. Such a massive data computation platform presents new challenges and opportunities for efficient and effective testing and validation. Traditional approaches to testing database systems are not always sufficient, for several reasons. Model-based query generation typically fails to provide coverage of user-defined code, which is very common in Scope scripts. Additionally, rapid release cycles in the platform-as-a-service environment require tools that quickly identify potential regressions, predict the impact of breaking changes, and provide massive test coverage in a short amount of time. In this paper, we describe a test automation tool, called Scope Playback, that addresses these new requirements. Scope Playback leverages the Scope system itself in two important ways. First, it exploits data about every job submitted to production clusters, which the Scope system stores automatically. Second, the testing process itself is implemented as a Scope script, so it automatically benefits from transparent and massive computation parallelism. Scope Playback currently serves as a crucial validation technique for ensuring product quality during Scope release cycles.