• DocumentCode
    3054396
  • Title

    Towards Reliability and Fault-Tolerance of Distributed Stream Processing System

  • Author

    Gorawski, Marcin ; Marks, Pawel

  • Author_Institution
    Silesian Univ. of Technol., Gliwice
  • fYear
    2007
  • fDate
    14-16 June 2007
  • Firstpage
    246
  • Lastpage
    253
  • Abstract
    Not so long ago, data warehouses were used to process data sets loaded periodically. We could distinguish two kinds of ETL processes: full and incremental. Now we often have to process real-time data and analyse them almost on the fly, so that the analyses are always up to date. There are many possible applications for real-time data warehouses. In most cases two features are important: delivering data to the warehouse as quickly as possible, and not losing any tuple in case of failure. In this paper we propose an architecture for gathering and processing data from geographically distributed data sources. We present a theoretical analysis, a mathematical model of a data source, and some rules for configuring system modules. At the end of the paper our future plans are described briefly.
  • Keywords
    data handling; data warehouses; fault tolerant computing; ETL processes; data delivery; distributed stream processing system; fault tolerance; geographically distributed data sources; system modules configuration; Application software; Computer networks; Computer science; Data analysis; Data warehouses; Energy consumption; Fault tolerant systems; Mathematical model; Meter reading; Patient monitoring
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Dependability of Computer Systems, 2007. DepCoS-RELCOMEX '07. 2nd International Conference on
  • Conference_Location
    Szklarska
  • Print_ISBN
    0-7695-2850-3
  • Type

    conf

  • DOI
    10.1109/DEPCOS-RELCOMEX.2007.50
  • Filename
    4272916