DocumentCode :
2721460
Title :
Blind men and an elephant: coalescing open-source, academic, and industrial perspectives on BigData
Author :
Douglas, Chris ; Curino, Carlo
Author_Institution :
Cloud & Inf. Services Lab., Microsoft, Redmond, WA, USA
fYear :
2015
fDate :
13-17 April 2015
Firstpage :
1523
Lastpage :
1526
Abstract :
This tutorial is organized in two parts. In the first half, we will present an overview of applications and services in the BigData ecosystem. We will use known distributed database and systems literature as landmarks to orient the attendees in this fast-evolving space. Throughout, we will contrast models of resource management, performance, and the constraints that shape the architectures of prominent systems. We will also discuss the role of academia and industry in the development of open-source infrastructure, with an emphasis on open problems and strategies for collaboration. We assume only basic familiarity with distributed systems. In the second half, we will delve into Apache Hadoop YARN. YARN (Yet Another Resource Negotiator) transformed Hadoop from a MapReduce engine to a general-purpose cluster scheduler. Since its introduction, it has been deployed in production and extended to support use cases beyond large-scale batch processing. The tutorial will present the active research and development supporting such heterogeneous workloads, with particular attention to multi-tenant scheduling. Topics include security, resource isolation, protocols, and preemption. This portion will be detailed, but accessible to anyone with a background in distributed systems and all attendees of the first half of the tutorial.
Keywords :
Big Data; batch processing (computers); data handling; distributed databases; parallel processing; public domain software; Apache Hadoop YARN; BigData ecosystem; MapReduce engine; distributed database; general-purpose cluster scheduler; large-scale batch processing; multitenant scheduling; open-source; resource management; yet another resource negotiator; Communities; Databases; Ecosystems; Engines; Resource management; Tutorials; Yarn
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering (ICDE), 2015 IEEE 31st International Conference on
Conference_Location :
Seoul
Type :
conf
DOI :
10.1109/ICDE.2015.7113417
Filename :
7113417