• DocumentCode
    1887745
  • Title

    Considerations for big data: Architecture and approach

  • Author

    Bakshi, Kapil

  • Author_Institution
    Cisco Syst. Inc., Herndon, VA, USA
  • fYear
    2012
  • fDate
    3-10 March 2012
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    The amount of data in our industry and the world is exploding. Data is being collected and stored at unprecedented rates. The challenge is not only to store and manage the vast volume of data (“big data”), but also to analyze and extract meaningful value from it. There are several approaches to collecting, storing, processing, and analyzing big data. The main focus of the paper is on unstructured data analysis. Unstructured data refers to information that either does not have a pre-defined data model or does not fit well into relational tables. Unstructured data is the fastest-growing type of data; examples include imagery, sensor data, telemetry, video, documents, log files, and email data files. There are several techniques to address this problem space of unstructured analytics. These techniques share the common characteristics of scale-out, elasticity, and high availability. MapReduce, in conjunction with the Hadoop Distributed File System (HDFS) and the HBase database, as part of the Apache Hadoop project, is a modern approach to analyzing unstructured data. Hadoop clusters are an effective means of processing massive volumes of data, and can be improved with the right architectural approach.
  • Keywords
    SQL; data analysis; Apache Hadoop project; HBase database; Hadoop cluster; Hadoop distributed file system; MapReduce; NoSQL; architectural approach; data management; data storage; unstructured data analysis; Availability; Benchmark testing; Computer architecture; Distributed databases; File systems; Relational databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Aerospace Conference, 2012 IEEE
  • Conference_Location
    Big Sky, MT
  • ISSN
    1095-323X
  • Print_ISBN
    978-1-4577-0556-4
  • Type

    conf

  • DOI
    10.1109/AERO.2012.6187357
  • Filename
    6187357