Detecting Transient Bottlenecks in n-Tier Applications through Fine-Grained Analysis

Author

Wang, Qingyang ; Kanemasa, Yasuhiko ; Li, Jie ; Jayasinghe, Danushka ; Shimizu, Tsuyoshi ; Matsubara, Masaki ; Kawaba, Motoyuki ; Pu, Calton

Author_Institution

Georgia Inst. of Technol., Atlanta, GA, USA

fYear

2013

fDate

8-11 July 2013

Firstpage

31

Lastpage

40

Abstract

Identifying the location of performance bottlenecks is a non-trivial challenge when scaling n-tier applications in computing clouds. Specifically, we observed that an n-tier application may experience significant performance loss when there are transient bottlenecks in component servers. Such transient bottlenecks arise frequently at high resource utilization and often result from transient events in an n-tier system (e.g., JVM garbage collection) and from bursty workloads. Because of their short lifespan (e.g., milliseconds), these transient bottlenecks are difficult to detect using current system monitoring tools, which sample at intervals of seconds or minutes. We describe a novel transient bottleneck detection method that correlates the throughput (i.e., request service rate) and load (i.e., number of concurrent requests) of each server in an n-tier system at fine time granularity. Both throughput and load can be measured through passive network tracing at millisecond-level time granularity. Using correlation analysis, we can identify transient bottlenecks at time granularities as short as 50 ms. We validate our method experimentally through two case studies on transient bottlenecks caused by factors at the system software layer (e.g., JVM garbage collection) and the architecture layer (e.g., Intel SpeedStep).
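
The load and throughput measurements described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that passive tracing has already produced one (arrival, departure) timestamp pair in milliseconds per request for a single server, and the names `window_metrics`, `flag_saturated`, and `saturation_load` are hypothetical. In particular, the saturation threshold is supplied by the caller here; how such a threshold would be derived from the correlation analysis is not reproduced.

```python
from typing import List, Tuple


def window_metrics(
    requests: List[Tuple[float, float]], window_ms: float = 50
) -> List[Tuple[float, int]]:
    """Return (load, throughput) for each fixed-size time window.

    requests: (arrival_ms, departure_ms) pairs for one server (assumed input).
    load: time-weighted average number of concurrent requests in the window.
    throughput: number of requests that completed in the window.
    """
    if not requests:
        return []
    start = min(arrival for arrival, _ in requests)
    end = max(departure for _, departure in requests)
    n_windows = int((end - start) // window_ms) + 1
    load = [0.0] * n_windows
    completed = [0] * n_windows
    for arrival, departure in requests:
        first = int((arrival - start) // window_ms)
        last = int((departure - start) // window_ms)
        # A request counts toward throughput in the window where it departs.
        completed[last] += 1
        # A request adds to load in proportion to the time it spends
        # inside each window it overlaps.
        for w in range(first, last + 1):
            w_lo = start + w * window_ms
            w_hi = w_lo + window_ms
            overlap = min(departure, w_hi) - max(arrival, w_lo)
            load[w] += max(overlap, 0.0) / window_ms
    return list(zip(load, completed))


def flag_saturated(
    metrics: List[Tuple[float, int]], saturation_load: float
) -> List[int]:
    """Return indices of windows whose load exceeds a caller-supplied threshold.

    saturation_load is a hypothetical parameter, not a value from the paper.
    """
    return [i for i, (load, _) in enumerate(metrics) if load > saturation_load]


if __name__ == "__main__":
    trace = [(0, 40), (10, 90), (60, 95)]  # made-up timestamps for illustration
    for i, (load, throughput) in enumerate(window_metrics(trace)):
        print(f"window {i}: load={load:.2f}, completed={throughput}")
```

Weighting each request by the time it spends inside a window, rather than counting requests at a single sampling instant, keeps the load estimate meaningful when requests are shorter than the window.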

Keywords

cloud computing; file servers; resource allocation; architecture layer; component servers; computing clouds; correlation analysis; fine time granularity; fine-grained analysis; millisecond-level time granularity; n-tier applications; passive network tracing; resource utilization; system monitoring tools; system software layer; transient bottleneck detection; transient events; Monitoring; Passive networks; Servers; Throughput; Time factors; Time measurement; Transient analysis; Performance evaluations; Web-facing applications; bottleneck; n-tier system; scalability

fLanguage

English

Publisher

IEEE

Conference_Title

Distributed Computing Systems (ICDCS), 2013 IEEE 33rd International Conference on

Conference_Location

Philadelphia, PA

ISSN

1063-6927

Type

conf

DOI

10.1109/ICDCS.2013.17

Filename

6681573