• DocumentCode
    3575233
  • Title
    The Berkeley Data Analytics Stack (BDAS)
  • Author
    Jayati
  • Author_Institution
    Impetus Infotech Pvt. Ltd., Indore, India
  • fYear
    2014
  • Firstpage
    1
  • Lastpage
    1
  • Abstract
    Summary form only given. The session on “The Berkeley Data Analytics Stack” will cover its current components, which include Spark, Shark, and Mesos, with emphasis on Spark and its real-time extension, Spark Streaming, which adds stream-processing capabilities to Spark. One-line descriptions of these technologies follow: 1) BDAS is an open source, next-generation data analytics stack under development at the UC Berkeley AMPLab. 2) Spark is a high-speed cluster computing system compatible with Hadoop that can outperform it by up to 100x thanks to its ability to perform computations in memory. 3) Shark is a port of Apache Hive onto Spark that is compatible with existing Hive warehouses and queries. Shark can answer HiveQL queries up to 100x faster than Hive without modification to the data or queries, and is also open source as part of BDAS. 4) Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications or frameworks. It can run Hadoop, MPI, Hypertable, Spark, and other applications on a dynamically shared pool of nodes. Apart from an elaborate explanation of various facets of Spark, the session also aims to walk through machine learning algorithm benchmarking and examples that substantiate the concepts covered.
  • Keywords
    application program interfaces; data analysis; data warehouses; learning (artificial intelligence); message passing; pattern clustering; public domain software; query processing; Apache Hive; BDAS; Berkeley data analytics stack; Hadoop; Hive queries; Hive warehouses; HiveQL queries; Hypertable; MPI; Mesos cluster manager; Shark; Spark-streaming; high-speed cluster computing system; machine learning algorithm benchmarking; open source next-generation data analytics stack; Abstracts; Sparks;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    IT in Business, Industry and Government (CSIBIG), 2014 Conference on
  • Print_ISBN
    978-1-4799-3063-0
  • Type
    conf
  • DOI
    10.1109/CSIBIG.2014.7056925
  • Filename
    7056925