DocumentCode :
2437931
Title :
REMO: Resource-Aware Application State Monitoring for Large-Scale Distributed Systems
Author :
Meng, Shicong ; Kashyap, Srinivas R. ; Venkatramani, Chitra ; Liu, Ling
Author_Institution :
Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA, USA
fYear :
2009
fDate :
22-26 June 2009
Firstpage :
248
Lastpage :
255
Abstract :
To observe, analyze and control large-scale distributed systems and the applications hosted on them, there is an increasing need to continuously monitor performance attributes of distributed system and application states. This results in application state monitoring tasks that require fine-grained attribute information to be collected from relevant nodes efficiently. Existing approaches either treat multiple application state monitoring tasks independently and build ad-hoc monitoring trees for each task, or construct a single static monitoring tree for multiple tasks. We argue that careful planning of multiple application state monitoring tasks, jointly considering multi-task optimization and node-level resource constraints, can provide significant gains in performance and scalability. In this paper, we present REMO, a REsource-aware application state MOnitoring system. REMO produces a forest of optimized monitoring trees through iterations of two phases, one phase exploring cost-sharing opportunities via estimation and the other refining the monitoring plan through resource-sensitive tree construction. Our experimental results include those gathered by deploying REMO on a BlueGene/P rack running IBM's large-scale distributed streaming system, System S. Using REMO to run over 200 monitoring tasks for an application deployed across 200 nodes results in a 35%-45% decrease in the percentage error of collected attributes compared to existing schemes.
Keywords :
distributed processing; iterative methods; optimisation; tree data structures; ad-hoc monitoring tree; fine-grained attribute information; large-scale distributed system; multitask optimization; resource-aware application state monitoring; resource-sensitive tree construction; Constraint optimization; Control system analysis; Control systems; Cost function; Distributed control; Large-scale systems; Monitoring; Performance analysis; Performance gain; Scalability; Continuous Query; Data Stream; Distributed Systems; Overlay; Resource-Aware; State monitoring;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Distributed Computing Systems, 2009. ICDCS '09. 29th IEEE International Conference on
Conference_Location :
Montreal, QC
ISSN :
1063-6927
Print_ISBN :
978-0-7695-3659-0
Electronic_ISBN :
1063-6927
Type :
conf
DOI :
10.1109/ICDCS.2009.15
Filename :
5158431