Title :
Online reinforcement learning for multimedia buffer control
Author :
Mastronarde, Nicholas ; Van der Schaar, Mihaela
Author_Institution :
Dept. of Electr. Eng., UCLA, Los Angeles, CA, USA
Abstract :
We formulate the multimedia buffer control problem as a Markov decision process. Because the application's rate-distortion-complexity behavior is unknown a priori, the optimal buffer control policy must be learned online. To this end, we adopt a low-complexity reinforcement learning algorithm, Q-learning, to learn the optimal control policy at run-time. We propose an accelerated Q-learning algorithm that exploits partial knowledge of the system's dynamics to dramatically improve performance. Our experiments show that the proposed application-aware reinforcement learning algorithm significantly outperforms existing application-independent reinforcement learning algorithms.
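The core update in tabular Q-learning can be illustrated with a minimal sketch. The state space, action set, transition model, and reward below are hypothetical placeholders chosen only to mimic a buffer-control flavor (the paper's actual rate-distortion-complexity formulation is not given in the abstract); the update rule itself is the standard Q-learning rule.

```python
import random

random.seed(0)  # for reproducibility of this illustrative run

# Hypothetical discretized buffer-control MDP (NOT the authors' model):
BUFFER_LEVELS = range(5)   # buffer occupancy states 0..4
ACTIONS = [0, 1, 2]        # e.g. candidate encoding-complexity levels
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

# Q-table initialized to zero
Q = {(s, a): 0.0 for s in BUFFER_LEVELS for a in ACTIONS}

def step(state, action):
    """Toy environment: higher complexity fills the buffer faster."""
    nxt = max(0, min(4, state + action - 1))
    reward = -abs(nxt - 2)  # penalize deviation from a target occupancy of 2
    return nxt, reward

def choose(state):
    """Epsilon-greedy action selection."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

state = 2
for _ in range(5000):
    action = choose(state)
    nxt, r = step(state, action)
    best_next = max(Q[(nxt, a)] for a in ACTIONS)
    # Standard Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
    state = nxt
```

After training, the greedy policy at the target occupancy picks the action that holds the buffer steady. The paper's "accelerated" variant additionally exploits partial knowledge of the transition dynamics, which a plain tabular learner like this does not use.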
Keywords :
Internet; Markov processes; computer aided instruction; learning (artificial intelligence); multimedia computing; optimal control; rate distortion theory; Markov decision process; Q-learning; multimedia buffer control; online reinforcement learning; optimal buffer control; rate-distortion-complexity behavior; Acceleration; Delay; Dynamic voltage scaling; Frequency; Hardware; Learning; Optimal control; Runtime; Streaming media; Voltage control; Markov decision processes; Multimedia buffer control; dynamic voltage scaling; encoder complexity control; reinforcement learning;
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495293