DocumentCode
1962881
Title
A system of job log analyzing for Hadoop
Author
Zhao Xiaogang ; Ma Zhiqiang ; Ding Ling ; Liu Xu
Author_Institution
Dept. of Software Eng., Wuhan Univ., Wuhan, China
Volume
3
fYear
2012
fDate
20-21 Oct. 2012
Firstpage
238
Lastpage
243
Abstract
Handling the huge volume of history logs produced by the Hadoop distributed computing platform is a troublesome task, and these history files often look useless. But to find out the health degree of the cluster platform, we must analyze the huge history logs produced by running jobs. A single-machine analysis program is not adequate because of its low speed and its high demand for memory and CPU. In this paper we try to solve this problem in a distributed way with the Map/Reduce calculation model. We also built a data platform (Hive and MySQL) to store the data. Our experiment shows that the distributed approach to processing log files performs well when the log files are huge.
Keywords
SQL; distributed processing; program diagnostics; storage management; CPU; Hadoop distributed computing platform; Map/Reduce calculation model; MySQL; a data platform; cluster platform; data log files; data storage; distributed problem solving; health degree; history files; history logs; hive; job log analyzing system; memory; process log files; running jobs; single-machine analyzing program; Blogs; Databases; History; Hadoop; Map/Reduce; distribute computing; history log;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Management, Innovation Management and Industrial Engineering (ICIII), 2012 International Conference on
Conference_Location
Sanya
Print_ISBN
978-1-4673-1932-4
Type
conf
DOI
10.1109/ICIII.2012.6339963
Filename
6339963