Seminars

Join our seminars mailing list by clicking here.

Tuesday, 2/21/2017, 2:00 PM—3:00 PM

Matrix Completion, Saddlepoints, and Gradient Descent

Physics and Astronomy Building 1434A
Alekh Agarwal
Microsoft Research

This talk considers a core question in reinforcement learning (RL): how can we tractably solve sequential decision-making problems in which the learning agent receives rich observations?

We begin with a new model called Contextual Decision Processes (CDPs) for studying such problems, and show that it encompasses several prior setups for studying RL, such as MDPs and POMDPs. Several special cases of CDPs are, however, known to be provably intractable in their sample complexity. To overcome this challenge, we further propose a structural property of such processes, called the Bellman Rank. We find that the Bellman Rank of a CDP (and an associated class of functions) provides an intuitive measure of the hardness of a problem in terms of sample complexity, that is, the number of samples an agent needs to discover a near-optimal policy for the CDP. In particular, we propose an algorithm whose sample complexity scales with the Bellman Rank of the process and, unlike most prior results, is completely independent of the size of the agent's observation space. We also show that our techniques are robust to our modeling assumptions, connect them to several known results, and highlight novel consequences of our results.

This talk is based on joint work with Nan Jiang, Akshay Krishnamurthy, John Langford and Rob Schapire.

Tuesday, 2/28/2017, 2:00 PM—3:00 PM

Topic: TBA

Physics and Astronomy Building 1434A
Xiaodong Li
Department of Statistics
UC Davis

Abstract: TBA

Tuesday, 3/7/2017, 2:00 PM—3:00 PM

Topic: TBA

Physics and Astronomy Building 1434A
Zongming Ma
Department of Statistics
University of Pennsylvania

Abstract: TBA

Tuesday, 3/14/2017, 2:00 PM—3:00 PM

Topic: TBA

Physics and Astronomy Building 1434A
Emilio Ferrara
Department of Mathematics
USC

Abstract: TBA