Guido Montúfar, Assistant Professor
Departments of Mathematics and Statistics, UCLA
Abstract:
Learning with artificial neural networks depends on the complexity of the functions that the network can represent, and also on the particular way the network assigns typical parameters to functions of different complexity. For networks with piecewise linear activations, the number of activation regions over the input space is a complexity measure with implications for depth separation, approximation errors, optimization, and robustness. In this talk I present recent advances on the maximum and expected complexity of the functions represented by networks with maxout units, which can be regarded as a multi-argument generalization of rectified linear units. In the first part, I present counting formulas and sharp upper bounds for the number of linear regions, with connections to Minkowski sums of polytopes. In the second part, I discuss the behavior for generic parameters and present upper bounds on the expected number of regions given a probability distribution over the parameters, showing that, as with networks with rectified linear units, for typical parameters the expected complexity of maxout networks can be much lower than the maximum.
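For intuition, the following is a minimal NumPy sketch (not part of the talk) of the objects in the abstract: a maxout unit takes the maximum of several affine functions of the input, and the input space decomposes into activation regions according to which affine piece attains the maximum in each unit. The network sizes, random Gaussian parameters, and sampling box below are illustrative assumptions; counting the distinct activation patterns seen on a sample gives only a lower bound on the number of linear regions.

import numpy as np

# A maxout unit of rank k computes max_j (w_j . x + b_j) over k affine
# arguments; a rank-2 maxout unit can represent a rectified linear unit.
rng = np.random.default_rng(0)
n_in, n_units, rank = 2, 3, 3                   # illustrative sizes
W = rng.standard_normal((n_units, rank, n_in))  # weights of the affine arguments
b = rng.standard_normal((n_units, rank))        # biases

def activation_pattern(x):
    """Tuple of argmax indices: which affine piece is active in each unit."""
    return tuple((W @ x + b).argmax(axis=1))

# Each distinct pattern observed on the sampled box witnesses at least one
# activation region, so the count of distinct patterns is a lower bound on
# the number of linear regions of this one-layer maxout network.
grid = np.linspace(-5.0, 5.0, 300)
patterns = {activation_pattern(np.array([u, v])) for u in grid for v in grid}
print(f"observed at least {len(patterns)} activation regions")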
This talk is based on joint work with Yue Ren, Leon Zhang, and Hanna Tseran.
Bio:
Guido Montúfar is an Assistant Professor at the Department of Mathematics and the Department of Statistics at UCLA. He studied mathematics and theoretical physics at TU Berlin and completed his PhD at the Max Planck Institute for Mathematics in the Sciences. Guido is interested in mathematical machine learning, especially the interplay of capacity, optimization, and generalization in deep learning. Since 2018 he has been the PI of the ERC Starting Grant project Deep Learning Theory. His research interfaces with information geometry, optimal transport, and algebraic statistics.