Abstract:
Large-scale scientific installations generate voluminous amounts of data (big data) every day. These data often need to be transferred over high-speed links (typically with a capacity of 10 Gb/s or more) to researchers located around the globe for storage and analysis. Efficiently transferring big data across countries or continents requires specialized big data transfer protocols. Several such protocols have been proposed in the literature; however, a comparative analysis of these protocols over a long-distance international network is lacking. We present a comparative performance and fairness study of three open-source big data transfer protocols, namely GridFTP, FDT, and UDT, using a 10 Gb/s high-speed link between New Zealand and Sweden. We find that there is little performance difference between GridFTP and FDT. GridFTP is stable in its handling of the file system and TCP socket buffers. UDT has an implementation issue that limits its performance. FDT's performance is limited by its small buffer size; however, this problem is overcome by using multiple flows. Our work indicates that faster file systems and larger TCP socket buffers in both the operating system and the application are useful in improving data transfer rates.