2025 – 2026 Acad. Year

Thursday 02/26/26, Time: 2-3:15pm, Local Geometric Structure Detection in Random Graphs

Location: Public Affairs Building 2270

Shuangping Li, Assistant Professor
Department of Statistics and Data Science, Yale University

Abstract:

We consider the problem of detecting a locally planted geometric structure in an Erdős–Rényi random graph. Despite the presence of structure, all vertices are marginally indistinguishable, making detection inherently higher-order. We identify the information-theoretic detection threshold as a function of the ambient dimension and the size of the planted subgraph, and show that it exhibits sharp phase transitions. Our results unify and extend prior work on detecting global random geometric graphs, while revealing new behavior unique to localized geometry. We further study computational aspects of detection, showing that simple polynomial-time statistics succeed only up to a strictly smaller dimension than the information-theoretic limit, providing evidence for a computational–statistical gap via the low-degree framework. This is based on forthcoming joint work with Jinho Bok and Sophie Yu.
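
The polynomial-time statistics alluded to above are not specified in the abstract; a standard low-degree statistic for detecting latent geometry (in the global setting) is the triangle count, since geometry induces local clustering that an Erdős–Rényi graph lacks. A minimal sketch, with all parameters illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 200, 0.1

def triangles(A):
    # Number of triangles = trace(A^3) / 6 for a simple undirected graph.
    return int(np.trace(A @ A @ A)) // 6

# Erdos-Renyi G(n, p).
U = np.triu(rng.random((n, n)) < p, 1)
A_er = (U | U.T).astype(float)

# Random geometric graph on the unit square, radius tuned so the
# edge density is roughly p (disk of radius r has area pi r^2 = p).
pts = rng.random((n, 2))
r = np.sqrt(p / np.pi)
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
A_geo = ((D < r) & (D > 0)).astype(float)

# Geometry creates many more closed triples at the same edge density.
print(triangles(A_er), triangles(A_geo))
```

At matched edge density the geometric graph's triangle count is an order of magnitude larger, which is what a simple counting statistic exploits below the information-theoretic limit.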

Bio:

Shuangping Li is an Assistant Professor of Statistics and Data Science at Yale University. She received her Ph.D. in Applied and Computational Mathematics from Princeton University, where she was co-advised by Professors Allan Sly and Emmanuel Abbe. Prior to joining Yale, she was a Stein Fellow in the Department of Statistics at Stanford University. Her research lies at the intersection of probability theory, high-dimensional statistics, theory of algorithms, and theoretical machine learning.

Thursday 02/19/26, Time: 2-3:15pm, Two Disciplines, One Mission — A Comparative View on Making Sense of Imperfect Data from Statistical Science to Machine Learning

Location: Public Affairs Building 2270

Grace Y. Yi, Professor
Department of Statistical and Actuarial Sciences & Department of Computer Science, University of Western Ontario

Abstract:

In the data-driven era, data quality plays a pivotal role in ensuring valid statistical inference and robust machine learning performance. Yet, imperfections such as measurement error in predictors and label noise in supervised learning are pervasive across a wide range of domains, including health sciences, epidemiology, economics, and beyond. These imperfections can obscure true patterns, introduce bias, and compromise the reliability of analyses. Such issues have attracted extensive attention from both the statistical and machine learning communities. In this talk, I will offer a brief comparative review of approaches in statistical science and machine learning, highlighting the importance of addressing data quality issues and developing strategies to mitigate their adverse effects on inference and prediction.

Bio:

Grace Y. Yi is a Professor and Tier I Canada Research Chair in Data Science at the University of Western Ontario. She is the author of the monograph “Statistical Analysis with Measurement Error or Misclassification: Strategy, Method, and Application” (2017), a co-editor of the “Handbook of Measurement Error Models” (with Aurore Delaigle and Paul Gustafson, 2021), and a coauthor of the monograph “Likelihood and its Extensions” (with Nancy Reid and Cristiano Varin, 2026).

Professor Yi is the 2025 Gold Medalist of the Statistical Society of Canada (SSC). She is a Fellow of the Institute of Mathematical Statistics and the American Statistical Association, and an Elected Member of the International Statistical Institute. She received the Award for Excellence in Graduate Student Mentoring from the University of Western Ontario (2023), and delivered the Myra Samuels Memorial Lecture at Purdue University (2025).

Professor Yi served as co-editor-in-chief of the Electronic Journal of Statistics (2022–2024), editor-in-chief of the Canadian Journal of Statistics (2016–2018), and is currently serving as editor of the methodology section of the New England Journal of Statistics in Data Science. She has served as president of the Statistical Society of Canada (2021–2022) and as chair of the Lifetime Data Science Section of the American Statistical Association (2023). In 2012 she founded the first chapter of the International Chinese Statistical Association (ICSA) – the Canada Chapter.

Thursday 02/05/26, Time: 2-3:15pm, Denoising Differentially Private Optimizers

Location: Public Affairs Building 2270

Meisam Razaviyayn, Associate Professor
Departments of Industrial and Systems Engineering, Computer Science, Quantitative and Computational Biology, and Electrical Engineering at the University of Southern California

Abstract:

Differentially private optimization provides a robust framework for safeguarding individual data during the training process of machine learning models. However, the substantial noise injection required (typically added after gradient clipping) often disrupts optimizer dynamics and severely degrades performance in large-scale training. To address this challenge, we introduce a general, optimizer-agnostic framework for denoising privatized gradients. Operating as a modular wrapper, our approach uses noisy gradient observations and provides refined estimates to the optimizer, requiring no internal modifications to standard algorithms such as SGD or Adam and incurring no additional privacy cost.

We ground our method in the Kalman filtering mechanism and optimal denoising based on a Taylor expansion of the objective function. We translate these theoretical insights into practical, memory-efficient filtering strategies (such as low-pass and Kalman filtering) that generate progressively refined gradient estimates. We establish rigorous privacy-utility trade-off guarantees for these mechanisms, ensuring they remain practical for large-scale applications. Extensive experiments across diverse domains, including vision tasks (CIFAR-100, ImageNet-1k) and language fine-tuning (GLUE, E2E, DART), demonstrate that this framework significantly outperforms state-of-the-art DP baselines, effectively mitigating the utility loss caused by privacy-preserving noise.
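
The Kalman-based mechanism itself is not detailed in the abstract; as a rough illustration of the wrapper idea only, the sketch below clips and privatizes gradients (the Gaussian mechanism) and then applies a low-pass (exponential moving average) filter before handing the estimate to the optimizer. All constants (clip, sigma, beta) and the stand-in gradient are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
d, T = 10, 500
clip, sigma, beta = 1.0, 1.0, 0.1   # clipping norm, noise multiplier, filter gain
g_true = np.full(d, 0.2)            # stand-in for a slowly varying true gradient

est = np.zeros(d)
raw_err, filt_err = 0.0, 0.0
for t in range(T):
    g = g_true.copy()
    g *= min(1.0, clip / np.linalg.norm(g))        # per-step gradient clipping
    g_priv = g + rng.normal(0, sigma * clip, d)    # Gaussian mechanism
    est = (1 - beta) * est + beta * g_priv         # low-pass (EMA) denoiser
    raw_err += np.sum((g_priv - g_true) ** 2)      # error of the raw DP gradient
    filt_err += np.sum((est - g_true) ** 2)        # error of the filtered estimate

print(f"raw: {raw_err:.0f}, filtered: {filt_err:.0f}")
```

Because the filter only post-processes already-privatized gradients, it spends no extra privacy budget, which is the sense in which such wrappers come "for free" under differential privacy.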

Bio:

Meisam Razaviyayn (https://sites.usc.edu/razaviyayn) is an associate professor in the departments of Industrial and Systems Engineering, Computer Science, Quantitative and Computational Biology, and Electrical Engineering at the University of Southern California. He also serves as the associate director of the USC-Meta Center for Research and Education in AI and Learning (https://realai.usc.edu) and is a Faculty Visitor at Google Research. Before joining USC, Meisam was a postdoctoral research fellow in the Department of Electrical Engineering at Stanford University. He earned his PhD in Electrical Engineering with a minor in Computer Science from the University of Minnesota, where he also received his M.Sc. in Mathematics. His research and academic efforts have been recognized with numerous awards, including the 2022 NSF CAREER Award, the 2022 Northrop Grumman Excellence in Teaching Award, the 2021 AFOSR Young Investigator Award, and the 2021 3M Nontenured Faculty Award. He received the 2020 ICCM Best Paper Award in Mathematics and the IEEE-DSW Best Paper Award in 2019, along with the Signal Processing Society Young Author Best Paper Award in 2014. Meisam was selected by the National Academy of Engineering to participate in the 2023 Frontiers of Engineering Symposium. Additionally, he was a finalist for the Best Paper Prize for Young Researchers in Continuous Optimization in 2013 and 2016, and a silver medalist in Iran’s National Mathematics Olympiad. His research focuses on the design and analysis of fundamental optimization algorithms relevant to the modern AI era.

Thursday 01/29/26, Time: 2-3:15pm, Estimating SNR in High-Dimensional Linear Models

Location: Public Affairs Building 2270

Xiaodong Li, Associate Professor
Department of Statistics, UC Davis

Abstract:

This talk develops robust methods for estimating signal-to-noise ratios (SNR) and variance components in high-dimensional linear models. We first show that the random-effects MLE remains consistent and asymptotically normal under substantial model misspecification, including fixed coefficients and heteroskedastic errors. We then extend the method-of-moments framework to multivariate responses, deriving asymptotic distributions using moment identities of the Wishart distribution. The resulting procedures require no sparsity assumptions and provide heteroskedasticity-robust inference through an explicit variance–inflation correction. Simulations demonstrate that the proposed confidence intervals achieve reliable coverage across a wide range of high-dimensional settings.
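
The talk's own estimators are not reproduced here; for orientation, the classical method-of-moments baseline for Gaussian designs (in the spirit of Dicker's estimator, which the random-effects and moment frameworks above generalize) can be sketched as follows, with illustrative simulation parameters:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 2000, 1000
beta = rng.normal(size=d) * np.sqrt(2.0 / d)   # ||beta||^2 ~ 2
X = rng.normal(size=(n, d))
y = X @ beta + rng.normal(size=n)              # noise variance 1, so true SNR ~ 2

# Moment identities for iid N(0,1) designs:
#   E||y||^2 / n      = ||beta||^2 + sigma^2
#   E||X'y||^2 / n    = (n + d + 1) ||beta||^2 + d sigma^2
a = y @ y / n
b = np.sum((X.T @ y) ** 2) / n
signal = (b - d * a) / (n + 1)                 # estimate of ||beta||^2
noise = a - signal                             # estimate of sigma^2
print(f"estimated SNR: {signal / noise:.2f}")
```

Note that this baseline uses no sparsity assumption, the same desideratum as the procedures in the talk.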

Bio:

Dr. Xiaodong Li is currently an associate professor in the Department of Statistics at UC Davis. He is mainly interested in methodology and theory in high-dimensional statistics and learning, particularly the interaction between optimization and statistics. His current research focuses include high-dimensional statistical inference, non-convex optimization, and network analysis. Dr. Li has received various awards, including an NSF CAREER Award, the 2019 Information Theory Paper Award, and being named a 2022-23 UC Davis Chancellor’s Fellow. He is currently serving as an associate editor for the Journal of Multivariate Analysis.

Thursday 01/22/26, Time: 2-3:15pm, Community Detection with the Bethe-Hessian

Location: Public Affairs Building 2270

Yizhe Zhu, Assistant Professor
Department of Mathematics, University of Southern California

Abstract:

The Bethe-Hessian matrix, introduced by Saade, Krzakala, and Zdeborová (2014), is a Hermitian matrix designed for applying spectral clustering algorithms to sparse networks. Rather than employing a non-symmetric and high-dimensional non-backtracking operator, a spectral method based on the Bethe-Hessian matrix is conjectured to also reach the Kesten-Stigum detection threshold in the sparse stochastic block model (SBM). We provide the first rigorous analysis of the Bethe-Hessian spectral method in the SBM under both the bounded expected degree and the growing degree regimes. Joint work with Ludovic Stephan.
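
Concretely, for a graph with adjacency matrix A and degree matrix D, the Bethe-Hessian is H(r) = (r^2 - 1)I + D - rA; evaluated near r = sqrt(average degree), its negative eigenvalues carry the community structure. A minimal sketch on a two-block SBM (all parameters illustrative, well above the detection threshold):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
labels = np.repeat([0, 1], n // 2)
p_in, p_out = 0.08, 0.01

# Sample a symmetric two-block SBM adjacency matrix (no self-loops).
P = np.where(labels[:, None] == labels[None, :], p_in, p_out)
A = np.triu((rng.random((n, n)) < P).astype(float), 1)
A = A + A.T

# Bethe-Hessian H(r) = (r^2 - 1) I + D - r A at r = sqrt(mean degree).
deg = A.sum(axis=1)
r = np.sqrt(deg.mean())
H = (r**2 - 1) * np.eye(n) + np.diag(deg) - r * A

# Informative eigenvectors correspond to negative eigenvalues of H(r);
# with two communities, the sign of the second-smallest eigenvector splits the graph.
w, V = np.linalg.eigh(H)
pred = (V[:, 1] > 0).astype(int)
acc = max(np.mean(pred == labels), np.mean(pred != labels))
print(f"negative eigenvalues: {(w < 0).sum()}, accuracy: {acc:.2f}")
```

The appeal over the non-backtracking operator, as the abstract notes, is that H(r) is a symmetric n x n matrix, so standard spectral clustering machinery applies directly.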

Bio:

Yizhe Zhu is an Assistant Professor of Mathematics at the University of Southern California. His research lies at the interface of probability, combinatorics, and data science, with a focus on random matrices and random graphs. Prior to joining USC, Dr. Zhu was a Visiting Assistant Professor at the University of California, Irvine, and a postdoctoral fellow at the Simons Laufer Mathematical Sciences Institute in Berkeley. He received his Ph.D. in Mathematics from the University of California, San Diego in 2021.

Thursday 01/15/26, Time: 2-3:15pm, Recent Advances in Experimental Design: Construction using Optimization Algorithms and Generative AI

Location: Public Affairs Building 2270

Alan Vazquez, Assistant Professor
Department of Industrial Engineering, Tecnologico de Monterrey, Mexico

Abstract:

Experimental design is a field of statistics that deals with the planning and analysis of physical experiments, computer experiments, and clinical trials. This talk will present two recent advances in the construction of two-arm clinical trials using optimization algorithms, and in the generation of two-level fractional factorial designs using popular large language models (LLMs). Specifically, the first topic concerns the construction of two-arm trials for personalized medicine applications using novel statistical criteria and integer programming. We will demonstrate the capabilities of our methodology using simulated and real datasets. The second topic concerns a systematic assessment of GPT and Gemini models to construct two-level fractional factorial designs with 8, 16, and 32 runs. To this end, we develop a prompt template using popular prompting techniques. We compare the designs obtained by the LLMs with the optimal designs in terms of statistical criteria.
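
As background for the second topic, a two-level fractional factorial design is obtained by aliasing an added factor onto an interaction column of a full factorial; for example, an 8-run 2^(4-1) design with defining relation I = ABCD. A minimal sketch:

```python
import numpy as np
from itertools import product

# Full 2^3 design in factors A, B, C (levels +/-1), then alias D = ABC,
# giving a 2^(4-1) resolution IV fraction with defining relation I = ABCD.
base = np.array(list(product([-1, 1], repeat=3)))
design = np.column_stack([base, base.prod(axis=1)])
print(design.shape)  # 8 runs, 4 factors

# Main-effect columns are mutually orthogonal: X'X = 8 I.
print(np.array_equal(design.T @ design, 8 * np.eye(4, dtype=int)))
```

Orthogonality of the main-effect columns is one of the statistical criteria against which LLM-generated designs can be benchmarked.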

Bio:

Dr. Alan Vazquez is an Assistant Professor in the Department of Industrial Engineering at Tecnologico de Monterrey, Mexico. His main research area involves the use of optimization algorithms to construct and analyze cost-effective experimental plans. His research is featured in several high-impact statistics journals and implemented in the experimental design software called EFFEX. Dr. Vazquez is part of the editorial board of the Journal of Quality Technology and the Quality Engineering journal, and a council member of the Quality, Statistics, and Reliability (QSR) section of INFORMS. From 2020 to 2022, he was an Assistant Adjunct Professor at the Department of Statistics and Data Science at UCLA.

Thursday 01/08/26, Time: 2-3:15pm, How to Use Synthetic Data for Improved Statistical Inference?

Location: Public Affairs Building 2270

Edgar Dobriban, Associate Professor in Statistics and Data Science
The Wharton School, The University of Pennsylvania

Abstract:

The rapid proliferation of high-quality synthetic data — generated by advanced AI models or collected as auxiliary data from related tasks — presents both opportunities and challenges for statistical inference. Here, we introduce the GEneral Synthetic-Powered Inference (GESPI) framework that wraps around any statistical inference procedure to safely enhance sample efficiency by combining synthetic and real data. Our framework leverages high-quality synthetic data to boost statistical power, yet adaptively defaults to the standard inference method using only real data when synthetic data is of low quality. The error of our method remains below a user-specified bound without any distributional assumptions on the synthetic data, and decreases as the quality of the synthetic data improves. This flexibility enables seamless integration with conformal prediction, risk control, hypothesis testing, and multiple testing procedures, all without modifying the base inference method. We demonstrate the benefits of our method on challenging tasks with limited labeled data, including AlphaFold protein structure prediction and comparisons of large reasoning models on complex math problems.

Bio:

Edgar Dobriban is an associate professor in the Department of Statistics and Data Science at the University of Pennsylvania, with a secondary appointment in Computer and Information Science. He obtained a PhD in statistics from Stanford University in 2017. His research interests are at the interface of statistics, machine learning, and AI.

Thursday 12/04/25, Time: 11:00am – 12:15pm, Searching for local associations while controlling the false discovery rate

Location: Public Affairs 2270

Matteo Sesia, Associate Professor
Department of Data Sciences and Operations, USC Marshall School of Business

Abstract:

This talk describes a new method to test local conditional hypotheses that express how the relation between explanatory variables and outcomes changes across different contexts, described by covariates. By expanding upon the model-X knockoff filter, this method can adaptively discover local associations while controlling the false discovery rate. Its inferences can help explain sample heterogeneity and uncover interactions, making better use of the capabilities offered by modern machine learning models. Specifically, it is able to leverage any model for the identification of data-driven hypotheses pertaining to different contexts, and then rigorously tests these hypotheses without succumbing to selection bias. Importantly, this approach is efficient and does not require sample splitting. Its effectiveness is demonstrated through numerical experiments and by studying the genetic architecture of waist-hip ratio across different sexes in the UK Biobank.

Bio:

Matteo Sesia is an Associate Professor of Data Sciences and Operations at the University of Southern California – Marshall School of Business, with a courtesy appointment in the USC Department of Computer Science. His research lies at the intersection of statistics and machine learning, focusing on developing rigorous and practical methods for analyzing high-dimensional and noisy data in settings where traditional modeling assumptions may not hold. He joined USC in 2020 after completing his Ph.D. in Statistics at Stanford University under the supervision of Emmanuel Candès.

Thursday 11/20/25, Time: 3-4pm, Data Theory Seminar: Mathematics of Cryo-Electron Microscopy

Location: CHS 43-105 (Center for Health Sciences Building)

Amit Singer, Professor
Department of Mathematics, Princeton University

Abstract:

Cryo-EM is a Nobel Prize-winning technology for determining 3-D biological molecular structures at high resolution. Reconstruction in cryo-EM is an inverse problem that involves many different fields of mathematics, including statistical inference, optimization (convex and non-convex), numerical analysis, dimension reduction, representation theory, information theory, and more. We will discuss the mathematical and statistical foundations underlying computational methods for 3-D reconstruction, focusing on the challenges of reconstructing small molecules and flexible molecules. In passing, we will contrast modern deep learning algorithms with classical applied math and statistical methods.

Bio:

Amit Singer is a Professor of Mathematics, the Director of the Program in Applied and Computational Mathematics (PACM), and a member of the Executive Committee for the Center for Statistics and Machine Learning (CSML) at Princeton University. He joined Princeton as an Assistant Professor in 2008. From 2005 to 2008 he was a Gibbs Assistant Professor in Applied Mathematics at the Department of Mathematics, Yale University. Singer received the BSc degree in Physics and Mathematics and the PhD degree in Applied Mathematics from Tel Aviv University (Israel), in 1997 and 2005, respectively. His list of awards includes SIAM Fellow (2022), the Simons Math+X Investigator Award (2017), a National Finalist for Blavatnik Awards for Young Scientists (2016), Moore Investigator in Data-Driven Discovery (2014), Simons Investigator Award (2012), Presidential Early Career Award for Scientists and Engineers (2010), the Alfred P. Sloan Research Fellowship (2010) and the Haim Nessyahu Prize for Best PhD in Mathematics in Israel (2007). His current research in applied mathematics focuses on theoretical and computational aspects of data science, and on developing computational methods for structural biology.

Thursday 11/20/25, Time: 11:00am – 12:15pm, Model-Free Approaches to Constructing Prediction Regions under Target Shift

Location: Public Affairs 2270

Huixia Judy Wang, Professor and Chair
Department of Statistics, Rice University

Abstract:

In many real-world applications, obtaining labeled data is a significant challenge due to high costs and technical limitations. This scarcity of labeled outcomes presents a major obstacle for traditional statistical inference. To address this, we introduce a model-free approach for constructing prediction regions for new target outcomes. Our method leverages a labeled source distribution, which is different from the target but related through a distributional shift, to overcome the lack of target labels. When target data are fully unlabeled, our predictions rely entirely on the rich source data; when some labels are available, we seamlessly integrate them to boost efficiency. A key innovation in this new approach lies in how we handle the complexities of different data distributions. We tackle non-exchangeability and non-identifiability by estimating the likelihood ratio through a novel technique: matching the covariate distributions of the source and target domains using a B-spline basis. This powerful approach allows us to accommodate complex error structures, including asymmetry and multimodality. To this end, we construct the highest predictive density sets using a new weight-adjusted conditional density estimator. This estimator models the source conditional density and then transforms it through a weighting scheme to accurately approximate the target conditional density. We will discuss the theoretical guarantees of our method and demonstrate its strong performance. We validate our approach through comprehensive simulation studies and a compelling real-world application using the MIMIC-III clinical database. This is joint work with Menghan Yi and Yanlin Tang.

Bio:

Dr. Huixia Judy Wang is the William Marsh Rice Trustee Professor in Data Science and Chair of the Department of Statistics at Rice University. Her prior academic appointments include faculty positions at The George Washington University and North Carolina State University. She also served as a Program Director at the National Science Foundation from 2018 to 2022. Dr. Wang’s research interests include biostatistics, high-dimensional inference, quantile regression, and extreme value analysis. Her work has been recognized with the NSF CAREER award, the Tweedie New Researcher Award from the Institute of Mathematical Statistics (IMS), and the IMS Medallion Lectureship. She is an elected Fellow of the American Statistical Association (ASA) and the IMS, and an elected member of the International Statistical Institute (ISI). She is currently the co-editor of Statistica Sinica and serves as an Associate Editor for the Journal of the American Statistical Association.

Thursday 11/13/25, Time: 11:00am – 12:15pm, Modern Gaussian Processes for Neuroimaging Data Analysis

Location: Public Affairs 2270

Jian Kang, Professor and Associate Chair for Research
School of Public Health, University of Michigan

Abstract:

Recent advances in neuroimaging have produced massive and heterogeneous datasets, ranging from fMRI with high spatial resolution to EEG with high temporal resolution, characterized by complex spatiotemporal correlations and substantial inter-subject variability. Traditional regression models and Gaussian process (GP) approaches with fixed parametric kernels often fail to model such complex data effectively while maintaining scalability and interpretability. This talk introduces a family of modern Bayesian GP frameworks that integrate deep kernel learning, neural network priors, and geometric modeling for large-scale neuroimaging analysis. An example is the Deep Kernel Learning Process (DKLP), which embeds deep neural networks within GP priors to learn data-adaptive covariance structures directly from imaging data. DKLP provides a unified modeling foundation for image-on-scalar, scalar-on-image, and image-on-image regression, supported by theoretical guarantees and efficient posterior computation. Applications to fMRI data from the Adolescent Brain Cognitive Development (ABCD) study reveal reproducible cortical activation patterns associated with cognitive ability, while analyses of EEG-based brain–computer interface data demonstrate robust neural decoding under high noise. I will also discuss scalable heat-kernel GPs on manifolds and thresholded GP–based spatially varying neural network priors, which together expand the scope of Bayesian inference for complex neuroimaging data.

Bio:

Jian Kang is Professor and Associate Chair for Research in the Department of Biostatistics at the University of Michigan, Ann Arbor. He received his B.S. in Statistics from Beijing Normal University in 2005, M.S. in Mathematics from Tsinghua University in 2007, and Ph.D. in Biostatistics from the University of Michigan in 2011. His research focuses on Bayesian modeling, statistical machine learning, and large-scale data integration, with applications in neuroimaging, metabolomics, and precision medicine. Dr. Kang is a Fellow of the American Statistical Association and has served as Associate Editor for the Journal of the American Statistical Association, Annals of Applied Statistics, Biometrics, and Statistics in Medicine. He received the 2025 University of Michigan School of Public Health Excellence in Research Award, the 2024 International Chinese Statistical Association President’s Citation Award, and the 2022 “Best Paper in Biometrics” Award from the International Biometric Society.

Wednesday 11/12/25, Time: 3:00pm – 4:00pm, Stein-Log-Sobolev inequalities for the continuous Stein variational gradient descent method

Location: Mathematical Sciences 8359

José A. Carrillo, Professor of Nonlinear Partial Differential Equations
Mathematical Institute, University of Oxford

Abstract:

The Stein Variational Gradient Descent method is a variational inference method in statistics that has recently received a lot of attention. The method provides a deterministic approximation of the target distribution, by introducing a nonlocal interaction with a kernel. Despite the significant interest, the exponential rate of convergence for the continuous method has remained an open problem, due to the difficulty of establishing the related so-called Stein-log-Sobolev inequality. Here, we prove that the inequality is satisfied for each space dimension and every kernel whose Fourier transform has a quadratic decay at infinity and is locally bounded away from zero and infinity. Moreover, we construct weak solutions to the related PDE satisfying exponential rate of decay towards the equilibrium. The main novelty in our approach is to interpret the Stein-Fisher information, also called the squared Stein discrepancy, as a duality pairing between H⁻¹(ℝⁿ) and H¹(ℝⁿ), which allows us to employ the Fourier transform. We also provide several examples of kernels for which the Stein-log-Sobolev inequality fails, partially showing the necessity of our assumptions.
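
For context, the discrete-time particle update of the Stein variational gradient descent method (Liu and Wang, 2016), whose continuous-time mean-field limit gives rise to the PDE and the Stein-Fisher information studied here, can be written as:

```latex
x_i \leftarrow x_i + \frac{\epsilon}{n} \sum_{j=1}^{n}
\left[ k(x_j, x_i)\, \nabla_{x_j} \log \pi(x_j) + \nabla_{x_j} k(x_j, x_i) \right],
\qquad i = 1, \dots, n,
```

where \pi is the target density, k is the kernel introducing the nonlocal interaction, and \epsilon is the step size.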

Bio:

José A. Carrillo is currently Professor of the Analysis of Nonlinear Partial Differential Equations at the Mathematical Institute and Tutorial Fellow in Applied Mathematics at The Queen’s College, University of Oxford. He mainly works on kinetic and nonlinear nonlocal diffusion equations. He has contributed to the theoretical and numerical analysis of PDEs, and their simulation in different applications such as granular media, semiconductors, collective behaviour, and lately in plasmas and tissue modelling. He is currently an officer-at-large of the International Council for Industrial and Applied Mathematics (2024-2028) and head of the Mathematics Section of the European Academy of Sciences, to which he was elected in 2018. He received the 2022 Echegaray Medal of the Royal Spanish Academy of Sciences and was a plenary speaker at the International Congress on Industrial and Applied Mathematics in Tokyo 2023.

Thursday 11/06/25, Time: 11:00am – 12:15pm, A Performance-Based Framework for Transfer Learning Measurement and Guidance

Location: Public Affairs 2270

Hao Helen Zhang, Professor and Chair
Statistics and Data Science GIDP, University of Arizona

Abstract:

Transfer learning has become a cornerstone of modern machine learning, enabling the transfer of knowledge from data-rich source domains to data-scarce target domains. However, determining whether transfer learning will be beneficial prior to implementation remains a critical challenge. This work proposes a novel performance-based framework to measure similarity between source and target datasets and quantify transferability. Theoretical justifications are provided by connecting the new measure to the cosine similarity between decision boundaries in supervised learning. Key advantages of the proposed framework include its nonparametric and flexible nature, easy implementation, and computational scalability without requiring estimation of the underlying data distribution. We further suggest practical guidance that categorizes source datasets into positive, ambiguous, or negative zones based on their transferability. Finally, we extend this approach to encoder-head architectures in deep learning. Numerical results and real-world applications are presented to demonstrate the empirical effectiveness of the framework.
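
The performance-based measure itself is not given in the abstract; purely to illustrate the cosine-similarity connection mentioned above, the sketch below compares estimated decision-boundary normals of source and target linear classifiers. The data-generating setup and the least-squares fit are hypothetical choices for illustration, not the talk's method:

```python
import numpy as np

rng = np.random.default_rng(2)

def boundary_direction(X, y):
    """Least-squares classifier direction, a stand-in for the decision-boundary normal."""
    w, *_ = np.linalg.lstsq(X, 2.0 * y - 1.0, rcond=None)
    return w / np.linalg.norm(w)

def cosine_transferability(Xs, ys, Xt, yt):
    # Cosine similarity between source and target boundary normals.
    return float(abs(boundary_direction(Xs, ys) @ boundary_direction(Xt, yt)))

d = 5
w_src = np.eye(d)[0]   # source boundary normal e_1
w_neg = np.eye(d)[1]   # orthogonal normal: a negative-transfer source

def make(w, n):
    X = rng.normal(size=(n, d))
    y = (X @ w > 0).astype(float)
    return X, y

Xt, yt = make(w_src, 200)   # small (data-scarce) target sample
sim_pos = cosine_transferability(*make(w_src, 2000), Xt, yt)
sim_neg = cosine_transferability(*make(w_neg, 2000), Xt, yt)
print(round(sim_pos, 2), round(sim_neg, 2))
```

A similarity near 1 would place a source in the positive zone and one near 0 in the negative zone, in the spirit of the categorization described above.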

Bio:

Dr. Hao Helen Zhang is Professor of Mathematics and Chair of the Statistics and Data Science Interdisciplinary Program at the University of Arizona. Her research focuses on nonparametric models, statistical machine learning, and high-dimensional data, with applications in biomedical and engineering fields. She has published over 100 research articles and received substantial funding support from NSF, NIH, and NSA, including the NSF CAREER and NSF TRIPODS awards. Dr. Zhang has served as Editor-in-Chief of Stat and as an Associate Editor of JASA and JRSSB. She is a Fellow of IMS, a Fellow of ASA, and an IMS Medallion Lecturer.

Thursday 10/23/25, Time: 11:00am – 12:15pm, Modeling Non-Uniform Hypergraphs Using Determinantal Point Processes

Location: Public Affairs 2270

Ji Zhu, Susan A. Murphy Collegiate Professor
Department of Statistics, University of Michigan

Abstract:

Most statistical models for networks focus on pairwise interactions between nodes. However, many real-world networks involve higher-order interactions among multiple nodes, such as co-authors collaborating on a paper. Hypergraphs provide a natural representation for these networks, with each hyperedge representing a set of nodes. The majority of existing hypergraph models assume uniform hyperedges (i.e., edges of the same size) or rely on diversity among nodes. In this work, we propose a new hypergraph model based on non-symmetric determinantal point processes. The proposed model naturally accommodates non-uniform hyperedges, has tractable probability mass functions, and accounts for both node similarity and diversity in hyperedges. For model estimation, we maximize the likelihood function under constraints using a computationally efficient projected adaptive gradient descent algorithm. We establish the consistency and asymptotic normality of the estimator. Simulation studies confirm the efficacy of the proposed model, and its utility is further demonstrated through edge predictions on several real-world datasets.
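
For intuition, an L-ensemble determinantal point process assigns a subset S probability det(L_S)/det(L + I), a tractable probability mass function over subsets of every size, hence naturally non-uniform hyperedges. The sketch below uses a symmetric kernel for simplicity, whereas the talk's model is built on non-symmetric DPPs:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
m = 5
B = rng.normal(size=(m, m))
L = B @ B.T                        # symmetric PSD kernel (L-ensemble)

Z = np.linalg.det(L + np.eye(m))   # normalizer: sum of det(L_S) over all subsets S

def prob(S):
    """P(hyperedge = S) = det(L_S) / det(L + I); the empty minor has det 1."""
    idx = list(S)
    return np.linalg.det(L[np.ix_(idx, idx)]) / Z if idx else 1.0 / Z

# The probabilities over all 2^m subsets sum to one.
total = sum(prob(S) for k in range(m + 1) for S in combinations(range(m), k))
print(round(total, 6))
```

Replacing the symmetric L with a non-symmetric kernel is what lets the model encode similarity as well as diversity among the nodes of a hyperedge.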

Bio:

Ji Zhu is Susan A. Murphy Collegiate Professor of Statistics at the University of Michigan, Ann Arbor. He received his B.Sc. in Physics from Peking University, China in 1996 and M.Sc. and Ph.D. in Statistics from Stanford University in 2000 and 2003, respectively. His primary research interests include statistical machine learning, statistical network analysis, and their applications to health and natural sciences. He received an NSF CAREER Award in 2008 and was elected as a Fellow of the American Statistical Association in 2013 and a Fellow of the Institute of Mathematical Statistics in 2015. From 2014 to 2020, he was recognized as an ISI Highly Cited Researcher by Web of Science, which annually lists leading researchers in the sciences and social sciences worldwide. In 2022, he received the International Chinese Statistical Association Pao-Lu Hsu Award. He served as the Editor-in-Chief of the Annals of Applied Statistics from 2022 to 2024.

Thursday 10/16/25, Time: 11:00am – 12:15pm, Nonlinear global Fréchet regression for random objects via weak conditional expectation

Location: Public Affairs 2270

Lingzhou Xue, Professor
Department of Statistics, The Pennsylvania State University

Abstract:

Random objects are complex non-Euclidean data taking values in general metric spaces, possibly devoid of any underlying vector space structure. Such data are becoming increasingly abundant with the rapid advancement in technology. Examples include probability distributions, positive semidefinite matrices, and data on Riemannian manifolds. However, except for regression for object-valued responses with Euclidean predictors and distribution-on-distribution regression, there has been limited development of a general framework for object-valued responses with object-valued predictors in the literature. To fill this gap, we introduce the notion of a weak conditional Fréchet mean based on Carleman operators and then propose a global nonlinear Fréchet regression model through the reproducing kernel Hilbert space (RKHS) embedding. Furthermore, we establish the relationships between the conditional Fréchet mean and the weak conditional Fréchet mean for both Euclidean and object-valued data. We also show that the state-of-the-art global Fréchet regression developed by Petersen and Müller (Ann. Statist. 47 (2019) 691–719) emerges as a special case of our method by choosing a linear kernel. We require that the metric space for the predictor admit a reproducing kernel. In contrast, the intrinsic geometry of the metric space for the response is utilized to study the asymptotic properties of the proposed estimates. Numerical studies, including extensive simulations and a real application, are conducted to investigate the finite-sample performance.

Bio:

Lingzhou Xue is a Professor of Statistics at Penn State. He received his Ph.D. in Statistics from the University of Minnesota in 2012, and he was a postdoctoral research associate at Princeton University from 2012 to 2013. His research interests include high-dimensional statistics, nonparametric statistics, machine learning, optimization, and statistical modeling in biomedical, environmental, and social sciences. He is a dedicated mentor to Ph.D. students and postdoctoral researchers, and five of his former advisees have become tenure-track faculty members in statistics. He became an Elected Fellow of the Institute of Mathematical Statistics (IMS) in 2024, an Elected Fellow of the American Statistical Association (ASA) in 2023, and an Elected Member of the International Statistical Institute (ISI) in 2016. He received the inaugural Committee of Presidents of Statistical Societies (COPSS) Emerging Leader Award in 2021, the inaugural Bernoulli Society New Researcher Award in 2019, and the International Consortium of Chinese Mathematicians Best Paper Award in 2019.

Tuesday 10/14/25, Time: 11:00am – 12:15pm, Data-Efficient Kernel Methods for Learning Differential Equations and Their Solution Operators

Location: Mathematical Sciences Building 8359

Houman Owhadi, Professor of Applied and Computational Mathematics and Control and Dynamical Systems
Computing and Mathematical Sciences Department, California Institute of Technology

Abstract:

We introduce a novel kernel-based framework for learning differential equations and their solution maps, which is efficient in terms of data requirements (both the number of solution examples and the amount of measurements from each example), as well as computational cost and training procedures. Our approach is mathematically interpretable and supported by rigorous theoretical guarantees in the form of quantitative worst-case error bounds for the learned equations and solution operators. Numerical benchmarks demonstrate significant improvements in computational complexity and robustness, achieving one to two orders of magnitude improvement in accuracy compared to state-of-the-art algorithms. This presentation is based on joint work with Yasamin Jalalian, Juan Felipe Osorio Ramirez, Alexander Hsu, and Bamdad Hosseini. A preprint is available at: https://arxiv.org/abs/2503.01036

Bio:

Houman Owhadi is an IBM professor of applied and computational mathematics and control and dynamical systems at the California Institute of Technology. His expertise includes uncertainty quantification, numerical approximation, statistical inference/learning, data assimilation, stochastic and multiscale analysis, and scientific machine learning. He was a plenary speaker at SIAM CSE 2015, SIAM UQ 2024 and EMI 2025, and a tutorial speaker at SIAM UQ 2016. He received the 2019 Germund Dahlquist SIAM Prize. He is a SIAM Fellow (class of 2022) and a Vannevar Busch Fellow (class of 2024).

Thursday 10/09/25, Time: 11:00am – 12:15pm, Unadorned Statistics in the Light of AI

Location: Public Affairs 2270

Heping Zhang, Susan Dwight Bliss Professor of Biostatistics
Yale University

Abstract:

Regression, clustering, and sequential analysis are fundamental techniques in statistics. Today, these same concepts are often relabeled as supervised learning, unsupervised learning, deep learning, reinforcement learning, or, more broadly, artificial intelligence. In this talk, I will present several of our statistical methods, developed in response to real-world applications, including the analysis of high-dimensional data for building-related occupant syndromes, inference of risk factors with uncertain frequencies from haplotype data, and residual diagnostics for generalized linear models. By revisiting these examples, I will highlight the essential ideas and techniques that our approaches share with modern AI methods. My goal is to reflect on why our statistical methods appear so “unadorned,” and to ask whether—and how—we might close the gap in how statistics and AI are recognized and valued.

Bio:

Heping Zhang, Ph.D., is the Susan Dwight Bliss Professor of Biostatistics at the Yale University School of Public Health. He also holds secondary appointments as Professor in the Child Study Center and the Department of Obstetrics, Gynecology, and Reproductive Sciences at the Yale School of Medicine, and in the Department of Statistics and Data Science at Yale University. He is the founding director of the Collaborative Center for Statistics in Science at Yale. Dr. Zhang is a Fellow of both the American Statistical Association and the Institute of Mathematical Statistics. He was the founding Editor-in-Chief of Statistics and Its Interface and previously served as an editor of the Journal of the American Statistical Association – Applications and Case Studies. His honors include the 2008 Myrto Lefkopoulou Distinguished Lecture at the Harvard School of Public Health, the 2011 IMS Medallion Lecture and Award, the 2022 Neyman Lecture and Award, the 2023 Distinguished Achievement Award from the International Chinese Statistical Association, and recognition as a 2023 Highly Cited Researcher by the Web of Science.

Thursday 10/02/25, Time: 11:00am – 12:15pm, Systems Learning of Single Cells

Location: Public Affairs 2270

Qing Nie, Distinguished Professor of Mathematics and Developmental & Cell Biology
University of California, Irvine

Abstract:

Cells make fate decisions in response to dynamic environments, and multicellular structures emerge from multiscale interplays among cells and genes in space and time. While single-cell omics data provides an unprecedented opportunity to profile cellular heterogeneity, the technology requires fixing the cells, often leading to a loss of spatiotemporal and cell interaction information. How to reconstruct temporal dynamics from single or multiple snapshots of single-cell omics data? How to recover interactions among cells, for example, cell-cell communication from single-cell gene expression data? I will present a suite of our recently developed computational methods that learn the single-cell omics data as a spatiotemporal and interactive system. Those methods are built on a strong interplay among systems biology modeling, dynamical systems approaches, machine-learning methods, and optimal transport techniques. The tools are applied to various complex biological systems in development, regeneration, and diseases to show their discovery power. Finally, I will discuss the methodology challenges in systems learning of single-cell data. Dr. Qing Nie is a University of California Presidential Chair and UCI Excellence in Teaching Chair, and a Distinguished Professor of Mathematics and Developmental & Cell Biology at University of California, Irvine. In research, Dr. Nie uses systems biology and data-driven methods to study complex biological systems with focuses on single-cell analysis, multiscale modeling, cellular plasticity, stem cells, embryonic development, and their applications to diseases. In training, Dr. Nie has supervised more than 60 postdoctoral fellows and PhD students, with many of them working in academic institutions now.

Bio:

Dr. Nie is a fellow of the American Association for the Advancement of Science (AAAS), a fellow of American Physical Society (APS), a fellow of Society for Industrial and Applied Mathematics (SIAM), and a fellow of American Mathematical Society (AMS). In 2025, Dr. Nie was ranked #1 based on the data analytics of publications and citations by ScholarGPS in the Highly Ranked Scholar list for two areas: Single-cell Transcriptomics and Transcriptomics Technologies, for the prior five years.

Thursday 09/25/25, Time: 11:00am – 12:15pm, Asymptotic FDR Control with Model-X Knockoffs: Is Moments Matching Sufficient?

Location: Public Affairs 2270

Yingying Fan, Centennial Chair in Business Administration & Professor of Data Sciences and Operations
USC Marshall School of Business

Abstract:

We propose a unified theoretical framework for studying the robustness of the model-X knockoffs framework by investigating the asymptotic false discovery rate (FDR) control of the practically implemented approximate knockoffs procedure. This procedure deviates from the model-X knockoffs framework by substituting the true covariate distribution with a user-specified distribution that can be learned using in-sample observations. By replacing the distributional exchangeability condition of the model-X knockoff variables with three conditions on the approximate knockoff statistics, we establish that the approximate knockoffs procedure achieves the asymptotic FDR control. Using our unified framework, we further prove that an arguably most popularly used knockoff variable generation method–the Gaussian knockoffs generator based on the first two moments matching–achieves the asymptotic FDR control when the two-moment-based knockoff statistics are employed in the knockoffs inference procedure. For the first time in the literature, our theoretical results justify formally the effectiveness and robustness of the Gaussian knockoffs generator. Simulation and real data examples are conducted to validate the theoretical findings.

Bio:

Yingying Fan is Associate Dean for the PhD Program, Centennial Chair in Business Administration, Professor in Data Sciences and Operations Department of the Marshall School of Business at the University of Southern California. Her research interests include statistics, data science, machine learning, economics, and big data and business applications. Her latest works have focused on knockoff inference, causal inference, and LLM model applications. She is the recipient of the Institute of Mathematical Statistics Medallion Lecture, Fellow of Institute of Mathematical Statistics, Fellow of American Statistical Association, the Royal Statistical Society Guy Medal in Bronze, the American Statistical Association Noether Young Scholar Award, and the NSF Faculty Early Career Development (CAREER) Award. She is a Co-Editor of the Journal of Business and Economic Statistics and Statistics Surveys.