Title :
Request Squeezer: Mitigating Tail Latency through Pruned Request Replication
Author :
Zuowei Zhang;Hailong Yang;Zhongzhi Luan;Depei Qian
Author_Institution :
Sino-German Joint Software Inst., Beihang Univ., Beijing, China
Abstract :
Modern Internet services compute over large-scale datasets and must respond to user requests instantly. To deliver a satisfactory user experience, the tail latency of these services ought to be managed within the service level agreement (SLA). Existing techniques for mitigating tail latency launch multiple replicas of each request on different machines and use the result of the one that finishes first. However, depending on the system utilization, a portion of the replicas violate the SLA even before running and thus waste resources unnecessarily when executed. These unnecessary replicas further delay subsequent replicas, causing the tail latency to miss the SLA target, especially under high system utilization. We present Request Squeezer, a methodology for mitigating tail latency through pruned request replication. For each replica, Request Squeezer uses the queuing and service time to predict the latency and terminates replicas that would miss the SLA target. In case all replicas are pruned, certain replicas are marked as survivors, which are immune to pruning. Evaluation with a Google Web Search workload shows that our approach saves 11.6% of resources and improves the maximum throughput by 25.9% while meeting the same SLA, compared with state-of-the-art request replication techniques.
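The pruning idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the latency prediction is computed or how survivors are chosen, so the sketch assumes that predicted latency is simply estimated queuing time plus estimated service time, and that the survivor is the single replica with the lowest predicted latency. The names `Replica`, `prune_replicas`, `queue_wait_ms`, `service_time_ms`, and `sla_ms` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    server: str
    queue_wait_ms: float    # assumed input: estimated wait in the server's queue
    service_time_ms: float  # assumed input: estimated time to process the request
    survivor: bool = False  # survivors are exempt from pruning

    @property
    def predicted_latency_ms(self) -> float:
        # Assumption: prediction is the plain sum of queuing and service time.
        return self.queue_wait_ms + self.service_time_ms


def prune_replicas(replicas, sla_ms):
    """Return the replicas that should still run for one request."""
    if not replicas:
        return []
    # Keep only replicas predicted to finish within the SLA target.
    kept = [r for r in replicas if r.predicted_latency_ms <= sla_ms]
    if not kept:
        # Every replica would be pruned: mark one as survivor so the
        # request is still served (assumed policy: lowest predicted latency).
        best = min(replicas, key=lambda r: r.predicted_latency_ms)
        best.survivor = True
        kept = [best]
    return kept


# Replica "b" is predicted to take 110 ms and is pruned under a 100 ms SLA.
replicas = [Replica("a", 10.0, 30.0), Replica("b", 80.0, 30.0)]
kept = prune_replicas(replicas, sla_ms=100.0)

# Under overload every replica misses the SLA, so one survivor is kept.
overloaded = [Replica("x", 200.0, 30.0), Replica("y", 150.0, 30.0)]
kept_overloaded = prune_replicas(overloaded, sla_ms=100.0)
```

In this sketch, pruning happens before a replica runs, which matches the abstract's point that replicas already destined to miss the SLA only waste resources and delay later requests.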
Keywords :
"Throughput","Web search","Google","Delays","Web and internet services","Probabilistic logic","Runtime"
Conference_Title :
High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on
DOI :
10.1109/HPCC-CSS-ICESS.2015.123