DocumentCode
167656
Title
Optimizing the Join Operation on Hive to Accelerate Cross-Matching in Astronomy
Author
Liang Li ; Dixin Tang ; Taoying Liu ; Hong Liu ; Wei Li ; Chenzhou Cui
Author_Institution
Inst. of Comput. Technol., Beijing, China
fYear
2014
fDate
19-23 May 2014
Firstpage
1735
Lastpage
1745
Abstract
Cross-matching in astronomy is a basic procedure for comprehensively analyzing the relations among different celestial objects. The aim is to search for celestial objects in different catalogs and to determine whether they are the same object. Basically, cross-matching can be expressed as a join query statement. Since celestial catalogs usually contain billions of stars, the join operator must be carefully designed and optimized for efficiency. In this paper, we focus on performing cross-matching with MapReduce-based join operators. The challenge is how to optimize the join operators to satisfy the specific requirements of cross-matching. Therefore, we propose an optimized method and investigate its efficiency by theoretical analysis and experiment. Our study shows that the method offers a remarkable improvement over previous work, especially when the data is very large.
Keywords
astronomy computing; optimisation; query processing; string matching; MapReduce; astronomy cross-matching; celestial object relations; join operation optimization; join query statement; Conferences; Distributed processing; Astronomy; Cross-Matching; Join; MapReduce;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location
Phoenix, AZ
Print_ISBN
978-1-4799-4117-9
Type
conf
DOI
10.1109/IPDPSW.2014.193
Filename
6969584