Title :
Software system performance debugging with kernel events feature guidance
Author :
Junghwan Rhee ; Hui Zhang ; Nipun Arora ; Guofei Jiang ; K. Yoshihira
Author_Institution :
NEC Labs. America, Princeton, NJ, USA
Abstract :
To diagnose performance problems in production systems, many OS kernel-level monitoring and analysis tools have been proposed. Using low-level kernel events to monitor application software is efficient and transparent to the application. However, such approaches miss application-specific semantic information, which can be effective in differentiating the trace patterns produced by distinct application logic. This paper introduces new trace analysis techniques based on event features to improve kernel-event-based performance diagnosis tools. Our prototype, AppDiff, is based on two kinds of analysis features: system resource features convert kernel events to resource usage metrics, thereby enabling the detection of various performance anomalies in a unified way; program behavior features infer the application logic behind the low-level events. By using these features and conditional probability, AppDiff can detect outliers and improve the diagnosis of application performance.
Keywords :
operating system kernels; probability; production engineering computing; program debugging; program diagnostics; software metrics; AppDiff; OS kernel-level analysis tools; OS kernel-level monitoring tools; application logic; application performance diagnosis improvement; application software monitoring; black-box monitoring; conditional probability; kernel event based performance diagnosis tool improvement; kernel events feature guidance; miss application-specific semantic information; outlier detection; performance anomaly detection; performance problem diagnosis; production systems; program behavior features; resource usage metrics; software system performance debugging; system management; system resource features; trace analysis techniques; trace patterns; Feature extraction; Kernel; Monitoring; Servers; Software systems; Training; distributed systems; kernel events monitoring; performance debugging
Conference_Title :
Network Operations and Management Symposium (NOMS), 2014 IEEE
Conference_Location :
Krakow
DOI :
10.1109/NOMS.2014.6838353