I am interested in algorithms that estimate and express uncertainty about the result of imprecise computations. Such imprecision can arise because the computational task is not analytically tractable, because a limited computational budget only allows a partial solution, or because the description of the task is itself imprecise to begin with. Probability measures provide the formal language for the description of such uncertainty. My group and I develop computer algorithms that take in and return probability measures; we call these probabilistic numerical methods.
can be found here. Don't trust it to always be up to date. If you need a bio-blurb for your event web-page or a talk introduction, here's a suggestion (sorry if this sounds like grandstanding, I've repeatedly been asked for such a text):
Philipp Hennig heads the Max Planck Research Group on Probabilistic Numerics at the MPI for Intelligent Systems in Tübingen, Germany. He studied physics in Heidelberg and London before moving to Cambridge, UK, where he did his PhD in David MacKay's inference group. Since then, he has been interested in the information content of computations and in mathematical notions of uncertainty for deterministic computation. Together with two colleagues from Oxford and Columbia University, he organized the inaugural workshop on Probabilistic Numerics in 2012, (re-)starting a community effort to provide a rigorous formulation of computation as the collection of information by autonomous, self-consistent agents. His work has been published in the leading venues of machine learning, as well as in journals of the applied mathematics community. Together with his group, he has provided novel interpretations of classic numerical algorithms as maximum a-posteriori estimators, and used these results to create new algorithmic tools for machine learning and artificial intelligence.
Mini-batch stochastic gradient descent and variants thereof have become standard for large-scale empirical risk minimization like the training of neural networks. These methods are usually used with a constant batch size chosen by simple empirical inspection. The batch size significantly influences the behavior of the stochastic optimization algorithm, though, since it determines the variance of the gradient estimates. This variance also changes over the optimization process; when using a constant batch size, stability and convergence are thus often enforced by means of a (manually tuned) decreasing learning rate schedule. We propose a practical method for dynamic batch size adaptation. It estimates the variance of the stochastic gradients and adapts the batch size to decrease the variance proportionally to the value of the objective function, removing the need for the aforementioned learning rate decrease. In contrast to recent related work, our algorithm couples the batch size to the learning rate, directly reflecting the known relationship between the two. On three image classification benchmarks, our batch size adaptation yields faster optimization convergence, while simultaneously simplifying learning rate tuning. A TensorFlow implementation is available.
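The coupling described in this abstract can be illustrated with a short sketch. It assumes, as a simplification not spelled out on this page, that the batch size is set proportional to the learning rate times the estimated gradient variance, divided by the current objective value, and that the objective is non-negative. The function names, the clipping bounds `m_min` and `m_max`, and the list-of-lists gradient representation are hypothetical; this is not the TensorFlow implementation mentioned in the abstract.

```python
def gradient_variance(per_example_grads):
    """Trace of the sample covariance of per-example gradients.

    per_example_grads: list of equal-length lists, one gradient per example.
    """
    n = len(per_example_grads)
    if n < 2:
        raise ValueError("need at least two per-example gradients")
    dim = len(per_example_grads[0])
    total = 0.0
    for j in range(dim):
        column = [g[j] for g in per_example_grads]
        mean = sum(column) / n
        # Unbiased sample variance of coordinate j.
        total += sum((x - mean) ** 2 for x in column) / (n - 1)
    return total


def adapt_batch_size(learning_rate, grad_variance, loss, m_min=16, m_max=2048):
    """Assumed rule: batch size ~ learning_rate * variance / loss, then clipped.

    m_min and m_max are illustrative bounds, not values from the paper.
    """
    if loss <= 0.0:
        # The ratio is undefined for a non-positive loss; fall back to the cap.
        return m_max
    proposal = learning_rate * grad_variance / loss
    return int(min(m_max, max(m_min, round(proposal))))


# As the loss shrinks at fixed variance, the proposed batch size grows.
for loss in (2.0, 1.0, 0.5):
    print(loss, adapt_batch_size(0.1, 4000.0, loss))
```

Under this assumed rule a larger learning rate also asks for a larger batch, which is one way to read the coupling between the two quantities that the abstract refers to.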
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016), 51, pages: 648-657, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C.), 2016 (conference)
González, J., Dai, Z., Hennig, P., Lawrence, N.
Batch Bayesian Optimization via Local Penalization
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016), 51, pages: 676-684, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C.), 2016 (conference)
Bartels, S., Hennig, P.
Probabilistic Approximate Least-Squares
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), IEEE International Conference on Robotics and Automation, May 2016 (inproceedings)
This paper proposes an automatic controller tuning framework based on linear optimal control combined with Bayesian optimization. With this framework, an initial set of controller gains is automatically improved according to a pre-defined performance objective evaluated from experimental data. The underlying Bayesian optimization algorithm is Entropy Search, which represents the latent objective as a Gaussian process and constructs an explicit belief over the location of the objective minimum. This is used to maximize the information gain from each experimental evaluation. The framework is therefore expected to yield improved controllers with fewer evaluations than alternative approaches. A seven-degree-of-freedom robot arm balancing an inverted pole is used as the experimental demonstrator. Results of two- and four-dimensional tuning problems highlight the method’s potential for automatic controller tuning on robotic platforms.
In Advances in Neural Information Processing Systems 28, pages: 181-189, (Editors: C. Cortes, N.D. Lawrence, D.D. Lee, M. Sugiyama and R. Garnett), Curran Associates, Inc., 29th Annual Conference on Neural Information Processing Systems (NIPS), 2015 (inproceedings)
In deterministic optimization, line searches are a standard tool ensuring stability and efficiency. Where only stochastic gradients are available, no direct equivalent has so far been formulated, because uncertain gradients do not allow for a strict sequence of decisions collapsing the search space. We construct a probabilistic line search by combining the structure of existing deterministic methods with notions from Bayesian optimization. Our method retains a Gaussian process surrogate of the univariate optimization objective, and uses a probabilistic belief over the Wolfe conditions to monitor the descent. The algorithm has very low computational cost, and no user-controlled parameters. Experiments show that it effectively removes the need to define a learning rate for stochastic gradient descent.
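The idea of a probabilistic belief over the Wolfe conditions can be sketched as follows. This is a simplified illustration, not the MATLAB research code: it assumes independent Gaussian beliefs over the four quantities f(0), f'(0), f(t) and f'(t), whereas a Gaussian process surrogate would also supply correlations between them, and it estimates the probability by Monte Carlo sampling rather than in closed form. The function name `wolfe_probability` and the constants `c1` and `c2` (conventional textbook values) are assumptions made for this example.

```python
import random


def wolfe_probability(t, f0, df0, ft, dft, c1=1e-4, c2=0.9,
                      n_samples=10000, seed=0):
    """Monte Carlo estimate of P(both weak Wolfe conditions hold at step t).

    f0, df0, ft, dft are (mean, std) pairs describing independent Gaussian
    beliefs over f(0), f'(0), f(t) and f'(t) along the search direction.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        s_f0 = rng.gauss(*f0)
        s_df0 = rng.gauss(*df0)
        s_ft = rng.gauss(*ft)
        s_dft = rng.gauss(*dft)
        # Sufficient decrease (Armijo) condition.
        decrease_ok = s_ft <= s_f0 + c1 * t * s_df0
        # Weak curvature condition.
        curvature_ok = s_dft >= c2 * s_df0
        if decrease_ok and curvature_ok:
            hits += 1
    return hits / n_samples


# Example on f(x) = (x - 1)^2 at step t = 1, with noisy evaluations.
p = wolfe_probability(1.0, (1.0, 0.5), (-2.0, 0.5), (0.0, 0.5), (0.0, 0.5))
print(f"estimated probability that t = 1 is acceptable: {p:.2f}")
```

A line search built on such a quantity could accept a step once the estimated probability exceeds a chosen threshold; the threshold itself would be a further design choice not shown here.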
[You can find the MATLAB research code under 'attachments' below. The zip file contains a minimal working example. The docstring in probLineSearch.m contains additional information. A more polished implementation in C++ will be published here at a later point. For comments and questions about the code, please write to firstname.lastname@example.org.]
Machine Learning in Planning and Control of Robot Motion Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems, September 2015 (conference)
This paper proposes an automatic controller tuning framework based on linear optimal control combined with Bayesian optimization. With this framework, an initial set of controller gains is automatically improved according to a pre-defined performance objective evaluated from experimental data. The underlying Bayesian optimization algorithm is Entropy Search, which represents the latent objective as a Gaussian process and constructs an explicit belief over the location of the objective minimum. This is used to maximize the information gain from each experimental evaluation. The framework is therefore expected to yield improved controllers with fewer evaluations than alternative approaches. A seven-degree-of-freedom robot arm balancing an inverted pole is used as the experimental demonstrator. Preliminary results of a low-dimensional tuning problem highlight the method’s potential for automatic controller tuning on robotic platforms.
Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 471(2179), 2015 (article)
We deliver a call to arms for probabilistic numerical methods: algorithms for numerical tasks, including linear algebra, integration, optimization and solving differential equations, that return uncertainties in their calculations. Such uncertainties, arising from the loss of precision induced by numerical calculation with limited time or hardware, are important for much contemporary science and industry. Within applications such as climate science and astrophysics, the need to make decisions on the basis of computations with large and complex data has led to a renewed focus on the management of numerical uncertainty. We describe how several seminal classic numerical methods can be interpreted naturally as probabilistic inference. We then show that the probabilistic view suggests new algorithms that can flexibly be adapted to suit application specifics, while delivering improved empirical performance. We provide concrete illustrations of the benefits of probabilistic numeric algorithms on real scientific problems from astrometry and astronomical imaging, while highlighting open problems with these new algorithms. Finally, we describe how probabilistic numerical methods provide a coherent framework for identifying the uncertainty in calculations performed with a combination of numerical algorithms (e.g. both numerical optimizers and differential equation solvers), potentially allowing the diagnosis (and control) of error sources in computations.
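One well-known instance of a classic rule arising as probabilistic inference is the trapezoidal rule: under a zero-mean Wiener process prior on the integrand, the posterior mean of the integral between the first and last evaluation node coincides with the trapezoidal estimate. The sketch below checks this numerically. The example is chosen here for illustration and is not quoted from the abstract; the function names are hypothetical, and the construction assumes strictly increasing nodes with a positive first node so that the kernel matrix k(x, x') = min(x, x') is invertible.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def bq_wiener_mean(xs, ys):
    """Posterior mean of the integral of f over [xs[0], xs[-1]].

    Prior: zero-mean Wiener process started at x = 0, kernel min(x, x').
    """
    a, b = xs[0], xs[-1]
    K = [[min(xi, xj) for xj in xs] for xi in xs]
    # z_i is the integral of min(x, x_i) over [a, b].
    z = [(xi ** 2 - a ** 2) / 2 + xi * (b - xi) for xi in xs]
    weights = solve(K, z)  # K is symmetric, so weights = K^{-1} z
    return sum(w * y for w, y in zip(weights, ys))


def trapezoid(xs, ys):
    """Classic trapezoidal rule on possibly non-uniform nodes."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
               for i in range(len(xs) - 1))


xs = [1.0, 2.0, 4.0]
ys = [3.0, -1.0, 2.0]
print(bq_wiener_mean(xs, ys), trapezoid(xs, ys))
```

The probabilistic formulation additionally yields a posterior variance for the integral, which this sketch omits.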
In Proceedings of the 18th International Conference on Artificial Intelligence and Statistics, 38, pages: 847-855, JMLR Workshop and Conference Proceedings, (Editors: Lebanon, G. and Vishwanathan, S.V.N.), JMLR.org, AISTATS, 2015 (inproceedings)
In Conference on Pattern Recognition (GCPR), 8753, pages: 331-341, Lecture Notes in Computer Science, (Editors: Jiang, X., Hornegger, J., and Koch, R.), Springer, GCPR, September 2014 (inproceedings)
Predicting the time at which the integral over a stochastic process reaches a target level is of interest in many applications. Often, such computations have to be made at low cost, in real time. As an intuitive example that captures many features of this problem class, we choose progress bars, a ubiquitous element of computer user interfaces. These predictors are usually based on simple point estimators, with no error modelling. This leads to fluctuating behaviour that is confusing to the user. It also does not provide a distribution prediction (risk values), which is crucial for many other application areas. We construct and empirically evaluate a fast, constant cost algorithm using a Gauss-Markov process model which provides more information to the user.
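A minimal sketch of a constant-cost predictor in this spirit is a scalar Kalman filter over the progress rate. It is far simpler than the model studied in the paper and rests on assumptions made for this example: the rate follows a Gaussian random walk, each observed rate carries independent Gaussian noise, and the remaining-time interval is obtained crudely by dividing the remaining work by the rate mean plus or minus `z` standard deviations. The class name `RateFilter` and all default parameters are hypothetical.

```python
import math


class RateFilter:
    """Scalar Kalman filter tracking a progress rate (work units per second)."""

    def __init__(self, mean=1.0, var=1.0, process_var=0.01, obs_var=0.25):
        self.mean = mean                # current belief about the rate
        self.var = var                  # variance of that belief
        self.process_var = process_var  # random-walk drift per update
        self.obs_var = obs_var          # noise of each observed rate

    def update(self, observed_rate):
        """Constant-cost predict and correct step."""
        self.var += self.process_var    # predict: the rate may have drifted
        gain = self.var / (self.var + self.obs_var)
        self.mean += gain * (observed_rate - self.mean)
        self.var *= 1.0 - gain

    def remaining_time(self, remaining_work, z=2.0):
        """Return (optimistic, central, pessimistic) remaining-time estimates."""
        std = math.sqrt(self.var)
        tiny = 1e-12                    # guards against division by zero
        fast = remaining_work / max(self.mean + z * std, tiny)
        central = remaining_work / max(self.mean, tiny)
        slow = remaining_work / max(self.mean - z * std, tiny)
        return fast, central, slow


progress = RateFilter()
for rate in (1.8, 2.3, 1.9, 2.1, 2.0):
    progress.update(rate)
print(progress.remaining_time(remaining_work=10.0))
```

Reporting an interval instead of a single number is what lets a user interface show how uncertain the estimate is, rather than a value that jumps around.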
In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, 33, pages: 347-355, JMLR: Workshop and Conference Proceedings, (Editors: S Kaski and J Corander), Microtome Publishing, Brookline, MA, AISTATS, April 2014 (inproceedings)
We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in the statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution such that statistics are less sensitive to inaccuracies. This leads to new Riemannian algorithms for mean value computations and principal geodesic analysis. Marginalisation also means results can be less precise than point estimates, enabling a noticeable speed-up over the state of the art. Our approach is an argument for a wider point that uncertainty caused by numerical calculations should be tracked throughout the pipeline of machine
In Advances in Neural Information Processing Systems 27, pages: 739-747, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), Curran Associates, Inc., 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (inproceedings)
In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pages: 230-239, (Editors: NL Zhang and J Tian), AUAI Press , Corvallis, Oregon, UAI2014, 2014, another link: http://arxiv.org/abs/1310.6740 (inproceedings)
In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014, Lecture Notes in Computer Science Vol. 8675, pages: 265-272, (Editors: P. Golland, N. Hata, C. Barillot, J. Hornegger and R. Howe), Springer, Heidelberg, MICCAI, 2014 (inproceedings)
Abstract: Locally weighted regression was created as a nonparametric learning method that is computationally efficient, can learn from very large amounts of data and add data incrementally. An interesting feature of locally weighted regression is that it can work with ...