Upcoming Seminars

Other UCLA departments frequently hold seminars related to Statistics that are likely of interest to our members. Here is the link to the UCLA Biostatistics seminars: https://www.biostat.ucla.edu/events

How to Subscribe to the UCLA Statistics Seminars Mailing List

Join the UCLA Statistics seminars mailing list by sending an email to sympa@sympa.it.ucla.edu with “subscribe stat_seminars” (without quotation marks) in the subject field and the message body blank. This must be done from the address that is to be subscribed. After doing that, please respond to the email that you receive; an automated email confirming that you have been added will follow.
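If you prefer to send the request from a script, here is a minimal sketch using Python’s standard library. The SMTP host is an assumption (adjust it for your mail setup), and the same helper covers the unsubscribe command described in the next section.

    import smtplib
    from email.message import EmailMessage

    def sympa_request(command, from_addr, smtp_host="localhost"):
        # Send a Sympa list command ("subscribe stat_seminars" or
        # "unsubscribe stat_seminars") from the address in question.
        msg = EmailMessage()
        msg["From"] = from_addr
        msg["To"] = "sympa@sympa.it.ucla.edu"
        msg["Subject"] = command  # the command goes in the subject line
        msg.set_content("")       # Sympa expects an empty message body
        with smtplib.SMTP(smtp_host) as server:  # host is an assumption
            server.send_message(msg)

    # e.g. sympa_request("subscribe stat_seminars", "you@g.ucla.edu")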

How to Unsubscribe from the UCLA Statistics Seminars Mailing List

You may be receiving our seminar emails because you are directly subscribed to our seminars mailing list, or because you are one of our graduate students, undergraduate students, faculty, etc. and are subscribed to a different mailing list that also receives the seminar emails. If you are directly subscribed, you may unsubscribe from the seminar mailing list by sending an email to sympa@sympa.it.ucla.edu with “unsubscribe stat_seminars” (without quotation marks) in the subject field and the message body blank. This must be done from the address that is subscribed. After sending that email, please follow the directions in the response that you receive.

Viewing our Seminars Remotely

When viewing one of our live seminars remotely, it is best to set Zoom to “Side-by-side: Speaker View”. You can see details of how to do this here.

Tuesday, 01/31/2023, Time: 11:00am – 12:15pm PT
Iterative Proximal Algorithms for Parsimonious Estimation

Alfonso Landeros, Postdoctoral Scholar
Computational Medicine, UCLA

Mathematical Sciences 8359

Abstract:

Statistical methods often involve solving an optimization problem, such as in maximum likelihood estimation and regression. The addition of constraints, either to enforce a hard requirement in estimation or to regularize solutions, complicates matters. Fortunately, the rich theory of convex optimization provides ample tools for devising novel methods. In this talk, I present applications of distance-to-set penalties to statistical learning problems. Specifically, I will focus on proximal distance algorithms, based on the MM principle, tailored to various applications such as regression and discriminant analysis. Emphasis is given to sparsity set constraints as a compromise between exhaustive combinatorial searches and lasso penalization methods that induce shrinkage.
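For readers unfamiliar with this class of methods, the following toy sketch (my illustration, not the speaker’s code; the penalty growth schedule and function names are arbitrary choices) shows one MM iteration of a proximal distance algorithm for least squares under a sparsity set constraint: the squared distance-to-set penalty is majorized by the squared distance to the projection of the current iterate, yielding a closed-form ridge-like update.

    import numpy as np

    def project_sparsity(beta, k):
        # Project onto the sparsity set {b : ||b||_0 <= k} by keeping
        # the k largest-magnitude entries and zeroing the rest.
        out = np.zeros_like(beta)
        keep = np.argsort(np.abs(beta))[-k:]
        out[keep] = beta[keep]
        return out

    def proximal_distance_sparse_ls(X, y, k, rho=1.0, growth=1.05, iters=300):
        # Sketch: minimize 0.5*||y - X b||^2 + (rho/2) * dist(b, S_k)^2,
        # with rho slowly increased so iterates approach the constraint set.
        n, p = X.shape
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        G, Xty = X.T @ X, X.T @ y
        for _ in range(iters):
            anchor = project_sparsity(beta, k)  # MM majorization point
            beta = np.linalg.solve(G + rho * np.eye(p), Xty + rho * anchor)
            rho *= growth                       # anneal the penalty upward
        return project_sparsity(beta, k)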

Bio:

Alfonso Landeros is a postdoctoral scholar at the University of California, Los Angeles, advised by Kenneth Lange. A Bruin for most of his adult life, he obtained his Ph.D. at UCLA in March 2021 under the supervision of Kenneth Lange and Dr. Mary Sehl. Before that, he completed a B.S. in Mathematics/Applied Science at UCLA in June 2013. His research interests span applied probability, mathematical optimization, computational statistics, and their applications to questions in genomics, cancer, epidemiology, and immunology.

Thursday, 02/02/2023, Time: 11:00am – 12:15pm PT
To Adjust or not to Adjust? Estimating the Average Treatment Effect in Randomized Experiments with Missing Covariates

Anqi Zhao, Assistant Professor
Department of Statistics and Data Science, National University of Singapore

Young Hall CS50

Abstract:

Randomized experiments allow for consistent estimation of the average treatment effect based on the difference in mean outcomes without strong modeling assumptions. Appropriate use of pretreatment covariates can further improve the estimation efficiency. Missingness in covariates is nevertheless common in practice and raises an important question: should we adjust for covariates subject to missingness, and if so, how? The unadjusted difference in means is always unbiased. The complete-covariate analysis adjusts for all completely observed covariates and is asymptotically more efficient than the difference in means if at least one completely observed covariate is predictive of the outcome. What, then, is the additional gain of adjusting for covariates subject to missingness? To reconcile the conflicting recommendations in the literature, we analyze and compare five strategies for handling missing covariates in randomized experiments under the design-based framework, and we recommend the missingness-indicator method, a known but not widely used strategy, for its multiple advantages. First, it removes the dependence of the regression-adjusted estimators on the imputed values for the missing covariates. Second, it does not require modeling the missingness mechanism, and it yields consistent estimators even when the missingness mechanism is related to the missing covariates and unobservable potential outcomes. Third, it ensures large-sample efficiency over both the complete-covariate analysis and the analysis based on only the imputed covariates. Lastly, it is easy to implement via least squares. We also propose modifications to it based on asymptotic and finite-sample considerations. Importantly, our theory views randomization as the basis for inference and does not impose any modeling assumptions on the data-generating process or missingness mechanism.
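As a rough illustration of how simple the least-squares implementation can be, here is a hypothetical numpy sketch of the missingness-indicator idea for a completely randomized experiment (the function name, the zero-imputation, and the fully interacted regression adjustment are my assumptions for illustration; the indicators make the choice of imputed value irrelevant).

    import numpy as np

    def missingness_indicator_ate(y, z, x):
        # y: outcomes (n,); z: 0/1 treatment (n,);
        # x: covariates (n, p) with np.nan marking missing entries.
        m = np.isnan(x).astype(float)       # missingness indicators
        x0 = np.where(np.isnan(x), 0.0, x)  # impute any fixed value;
                                            # the indicators absorb it
        w = np.column_stack([x0, m])        # augmented covariate matrix
        wc = w - w.mean(axis=0)             # center augmented covariates
        design = np.column_stack([np.ones_like(y), z, wc, z[:, None] * wc])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        return coef[1]                      # coefficient on z estimates ATE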

Bio:

Anqi Zhao is an assistant professor in the Department of Statistics and Data Science, National University of Singapore (NUS). She received her PhD in statistics from Harvard in 2016 and joined NUS in 2019 after an excursion into the management consulting world. Her research interests include experimental design and causal inference from randomized experiments and observational studies.

Thursday, 02/09/2023, Time: 11:00am – 12:15pm PT
Combining biased and unbiased data for estimating stratified COVID-19 infection fatality rates

Gonzalo E. Mena, Postdoctoral Fellow
Department of Statistics, University of Oxford

Young Hall CS50

Abstract:

One major limitation of so-called ‘big data’ is that bigger sample sizes don’t lead to more reliable conclusions if the data are corrupted by bias. However, answering complex scientific and societal questions requires us to think about how to draw inferences from such corrupted data efficiently. One emerging paradigm consists of suitably combining unbiased (but typically small and expensive) datasets with biased (but cheap and bigger) ones. Unfortunately, although Bayesian inference is a major workhorse of modern scientific research, methods for combining information within this paradigm are still lacking, and we often have to content ourselves with the suboptimal solution of throwing away all biased data.

In this talk, I will present a computationally efficient Bayesian method for combining biased and unbiased data that enjoys theoretical guarantees. This method is based on a predictive philosophy: given a family of Bayesian models indexed by an unknown parameter representing how data should be merged, we seek the value that best predicts unobserved units given the rest of the observed ones. I study the performance of our method in depth in the Gaussian case, showing that if D is greater than 8, then including biased data is always better than not doing so. Moreover, I show that it enjoys a certain robustness property, making it preferable to the best available baseline, the Green-Strawderman shrinkage estimator. This criterion can be seamlessly implemented through leave-one-out cross-validation in standard probabilistic programming pipelines, and I show through simulations that the benefits manifest in more complex scenarios as well, for example in hierarchical models.
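To make the predictive criterion concrete, here is a toy Gaussian sketch (my own simplification, not necessarily the speaker’s construction; the power-likelihood tempering, the conjugate model, and all names are illustrative assumptions): the biased likelihood is raised to a weight w, and w is chosen by leave-one-out prediction of the unbiased units.

    import numpy as np
    from scipy.stats import norm

    def power_posterior(y_unb, y_bias, w, sigma=1.0, prior_var=100.0):
        # Conjugate Gaussian posterior for a mean theta, with the biased
        # likelihood raised to the power w in [0, 1] (w = 0 discards it).
        prec = 1 / prior_var + (len(y_unb) + w * len(y_bias)) / sigma**2
        mean = (y_unb.sum() + w * y_bias.sum()) / sigma**2 / prec
        return mean, 1 / prec

    def select_w_by_loo(y_unb, y_bias, w_grid, sigma=1.0):
        # Choose the merging weight that best predicts each held-out
        # unbiased observation (summed log predictive density).
        scores = []
        for w in w_grid:
            lpd = 0.0
            for i in range(len(y_unb)):
                rest = np.delete(y_unb, i)
                m, v = power_posterior(rest, y_bias, w, sigma)
                lpd += norm.logpdf(y_unb[i], m, np.sqrt(v + sigma**2))
            scores.append(lpd)
        return w_grid[int(np.argmax(scores))]

    # e.g. select_w_by_loo(serosurvey, surveillance, np.linspace(0, 1, 21))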

I apply these methods to an important scientific and policy-sensitive question: determining how COVID-19 lethality depends on age and socioeconomic status. This problem is remarkably hard since lethality is defined in terms of the true number of infections, a quantity that we typically observe with bias. Using small-area data from Chile, I present three stratified examples based on biased (administrative surveillance data), unbiased (a serosurvey), and biased + unbiased data to confirm the result that there is a strong dependence of lethality on socioeconomic status among younger populations.

Bio:

Gonzalo Mena is a Florence Nightingale Fellow in Computational Statistics and Machine Learning at the Department of Statistics, University of Oxford. Prior to that he was a Data Science Initiative Postdoctoral Fellow at Harvard University. He earned his PhD in Statistics at Columbia University, advised by Liam Paninski. Before his PhD, he obtained a bachelor’s degree in Mathematical Engineering at the Universidad de Chile, in his home country. His main research motivation is the development of statistical methods to address complex scientific and societal problems.

Tuesday, 02/14/2023, Time: 11:00am – 12:15pm PT
Learning Systems in Adaptive Environments: Theory, Algorithms and Design

Aldo Pacchiano, Postdoctoral Researcher
Microsoft Research

Mathematical Sciences 8359

Abstract:

Recent years have seen great successes in the development of learning algorithms for static predictive and generative tasks, where the objective is to learn a model that performs well on a single test deployment and in applications with abundant data. Comparatively less success has been achieved in designing algorithms for deployment in adaptive scenarios, where the data distribution may be influenced by the choices of the algorithm itself, the algorithm needs to adaptively learn from human feedback, or the nature of the environment is rapidly changing. These are some of the most important challenges in the development of ML-driven solutions for technologies such as internet social systems, ML-driven scientific experimentation, and robotics. To fully realize the potential of these technologies, we will need better ways of designing algorithms for adaptive learning. In this talk I propose the following algorithm design considerations for adaptive environments: 1) data-efficient learning, 2) generalization to unseen domains via effective knowledge transfer, and 3) adaptive learning from human feedback. I will give an overview of my work along each of these axes and introduce a variety of open problems and research directions inspired by this conceptual framing.

Bio:

Aldo Pacchiano is a Postdoctoral Researcher at Microsoft Research NYC. He obtained his PhD at UC Berkeley, where he was advised by Peter Bartlett and Michael Jordan. His research lies in the areas of Reinforcement Learning, Online Learning, Bandits, and Algorithmic Fairness. He is particularly interested in furthering our statistical understanding of learning phenomena in adaptive environments and in using these theoretical insights and techniques to design efficient and safe algorithms for scientific, engineering, and large-scale societal applications.