
Optimal Reinforcement Learning for Gaussian Systems


Conference Paper



The exploration-exploitation trade-off is among the central challenges of reinforcement learning. The optimal Bayesian solution is intractable in general. This paper studies to what extent analytic statements about optimal learning are possible if all beliefs are Gaussian processes. A first-order approximation of learning of both loss and dynamics, for nonlinear, time-varying systems in continuous time and space, subject to a relatively weak restriction on the dynamics, is described by an infinite-dimensional partial differential equation. An approximate finite-dimensional projection gives an impression of how this result may be helpful.

Author(s): Hennig, P.
Book Title: Advances in Neural Information Processing Systems 24
Pages: 325-333
Year: 2011
Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger

Department(s): Empirical Inference, Probabilistic Numerics
Bibtex Type: Conference Paper (inproceedings)

Event Name: Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS 2011)
Event Place: Granada, Spain


Links: PDF


  title = {Optimal Reinforcement Learning for Gaussian Systems},
  author = {Hennig, P.},
  booktitle = {Advances in Neural Information Processing Systems 24},
  pages = {325--333},
  editors = {J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger},
  year = {2011}