Distributed Stream Processing Analysis in High Availability Context

Authors:
Marcin Gorawski;Pawel Marks
Affiliations:
Silesian University of Technology, Poland;Silesian University of Technology, Poland
Venue:
ARES '07 Proceedings of the Second International Conference on Availability, Reliability and Security
Year:
2007

Citing 0
Cited 2

Towards automated analysis of connections network in distributed stream processing system
DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications

Collecting data streams from a distributed radio-based measurement system
DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications

Abstract

Not so long ago, data warehouses were used to process data sets loaded periodically during the ETL (Extraction, Transformation and Loading) process. Two kinds of ETL processes could be distinguished: full and incremental. Now we often have to process real-time data and analyse it almost on the fly, so that the analyses are always up to date. There are many possible applications for real-time data warehouses. In most cases two features are important: delivering data to the warehouse as quickly as possible, and not losing any tuple in case of failure. In this paper we propose an architecture for gathering and processing data from geographically distributed data sources. We present a theoretical analysis, a mathematical model of a data source, some rules for configuring the system modules, and the results of experiments. At the end of the paper we briefly describe our future plans.