DocumentCode :
1899431
Title :
Efficient handling of heterogeneous file formats in HDFS
Author :
Prashant, More Vaishali ; Raut, Suhas D.
Author_Institution :
Dept. of Comput. Sci. & Eng., N.K. Orchid Coll. of Eng. & Tech, Solapur, India
fYear :
2015
fDate :
5-7 March 2015
Firstpage :
1
Lastpage :
6
Abstract :
The amount of data in our industry and the world is exploding. Big data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. In an organization, multiple types of documents are collected from different sources: documents that need to be accessible immediately, documents that need to be accessed within a few seconds or minutes, and documents that are accessed infrequently. While these types of documents play different roles within an organization, each is valuable. These different types of documents require different kinds of storage solutions. To handle such heterogeneous file formats, we use Hadoop. In Hadoop, storage of different documents is provided by HDFS (Hadoop Distributed File System). In educational organizations, document categorization is also one of the most important tasks. The need to keep documents available and to assign a category to each document motivated this project.
Keywords :
Big Data; distributed databases; document handling; storage management; HDFS; Hadoop distributed file system; data availability; data exponential growth; documents access; documents categorization; educational organization; heterogeneous file format handling; storage solutions; Random access memory; Software; Hadoop
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electrical, Computer and Communication Technologies (ICECCT), 2015 IEEE International Conference on
Conference_Location :
Coimbatore
Print_ISBN :
978-1-4799-6084-2
Type :
conf
DOI :
10.1109/ICECCT.2015.7226034
Filename :
7226034