Thursday, 06/03/2021, Time: 11:00am – 12:15pm PST
On the maximum and expected complexity of maxout networks
Guido Montúfar, Assistant Professor
Departments of Mathematics and Statistics, UCLA
Learning with artificial neural networks relies on the complexity of the functions that can be represented by the network and also the particular way it assigns typical parameters to functions of different complexity. For networks with piecewise linear activations, the number of activation regions over the input space is a complexity measure with implications in depth separation, approximation errors, optimization, and robustness. In this talk I present recent advances on the maximum and expected complexity of the functions represented by networks with maxout units, which can be regarded as a multi-argument generalization of rectified linear units. In the first part, I present counting formulas and sharp upper bounds for the number of linear regions, with connections to Minkowski sums of polytopes. In the second part, I discuss the behavior for generic parameters and present upper bounds on the expected number of regions given a probability distribution over the parameters, showing that, similar to networks with rectified linear units, for typical parameters the expected complexity of maxout networks can be much lower than the maximum.
This talk is based on joint works with Yue Ren, Leon Zhang, and Hanna Tseran.
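As a rough illustration of the objects in this abstract (not the counting formulas from the talk), the sketch below treats a single maxout unit on a one-dimensional input as the maximum of several affine functions and counts its linear regions numerically on a grid. The function name `count_linear_regions`, the grid bounds, and the example parameters are all assumptions made for this illustration.

```python
def count_linear_regions(pieces, lo=-3.0, hi=3.0, steps=601):
    """Count the linear regions of x -> max_i (w_i * x + b_i) on [lo, hi].

    `pieces` is a list of (w, b) pairs, one per argument of the maxout unit.
    The count is found numerically by tracking which piece attains the
    maximum along a grid, so it is only as fine as the grid.
    """
    regions = 0
    previous = None
    for s in range(steps):
        x = lo + (hi - lo) * s / (steps - 1)
        values = [w * x + b for w, b in pieces]
        active = values.index(max(values))  # first maximiser on ties
        if active != previous:
            regions += 1
            previous = active
    return regions


# A rectified linear unit is the two-argument case max(0, x).
relu = [(0.0, 0.0), (1.0, 0.0)]
# A three-argument maxout unit: max(-x, 1, x).
maxout3 = [(-1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]

relu_regions = count_linear_regions(relu)
maxout3_regions = count_linear_regions(maxout3)
```

On a one-dimensional input, a unit with k arguments can have at most k linear pieces, and the two examples attain that maximum. Other parameter choices, for instance ones where some affine piece never attains the maximum, give fewer regions.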
Guido Montúfar is an Assistant Professor at the Department of Mathematics and the Department of Statistics at UCLA. He studied mathematics and theoretical physics at the TU Berlin and completed his PhD at the Max Planck Institute for Mathematics in the Sciences. Guido is interested in mathematical machine learning, especially the interplay of capacity, optimization, and generalization in deep learning. Since 2018 he has been the PI of the ERC starting grant project Deep Learning Theory. His research interfaces with information geometry, optimal transport, and algebraic statistics.
Thursday, 05/27/2021, Time: 11:00am – 12:15pm PST
Mapping Item-Response Interactions: A Latent Space Approach to Item Response Data with Interaction Maps
Minjeong Jeon, Associate Professor
Department of Education, UCLA
In this talk, I introduce a novel latent space modeling approach to educational and psychological assessment data. In this approach, respondents’ binary responses to test items are viewed as a bipartite network between respondents and items where a tie is made when a respondent gives a correct (or positive) answer to an item. The resulting latent space model provides a window into respondents’ performance in the assessment, placing respondents and test items in a shared metric space referred to as an interaction map. The interaction map approach can help assess students’ strengths and weaknesses from cognitive assessment and identify patients’ symptom profiles from clinical assessment data. I will illustrate the utilities of the proposed approach, focusing on how the interaction map can help derive insightful diagnostic information on items and respondents.
Minjeong Jeon, Ph.D. is an Associate Professor of Advanced Quantitative Methods in the UCLA Department of Education. Dr. Jeon received her Ph.D. in Quantitative Methods from UC Berkeley. Before joining the UCLA faculty, she was an Assistant Professor of Quantitative Psychology at Ohio State University. Her research revolves around developing, applying, and estimating latent variable models for studying measurement and growth. Her recent research topics include latent space modeling, process modeling, and joint analysis.
Thursday, 05/20/2021, Time: 11:00am – 12:15pm PST
Prediction Games: From Maximum Likelihood Estimation to Active Learning, Fair Machine Learning, and Structured Prediction
Brian Ziebart, Associate Professor
Department of Computer Science, University of Illinois at Chicago
A standard approach to supervised machine learning is to choose the form of a predictor and to then optimize its parameters based on training data. Approximations of the predictor’s performance measure are often required to make this optimization problem tractable. Instead of approximating the performance measure and using the exact training data, this talk explores game-theoretic approximations of the training data while optimizing the exact performance measures of interest. Though the resulting “prediction games” reduce to maximum likelihood estimation in simple cases, they provide new methods for more complicated prediction tasks involving covariate shift, fairness constraint satisfaction, and structured data.
Brian Ziebart is an Associate Professor in the Department of Computer Science at the University of Illinois at Chicago and a Software Engineer at Aurora Innovation. He earned his PhD in Machine Learning from Carnegie Mellon University where he was also a postdoctoral fellow. His interests lie in the intersections between machine learning, game theory, and decision theory. He has published over 35 articles in leading machine learning and artificial intelligence venues, including a Best Paper at the International Conference on Machine Learning.
Thursday, 05/13/2021, Time: 11:00am – 12:15pm PST
Bayesian Regression Tree Models for Causal Inference
Carlos M. Carvalho, Professor
The University of Texas McCombs School of Business
This paper presents a novel nonlinear regression model for estimating heterogeneous treatment effects, geared specifically towards situations with small effect sizes, heterogeneous effects, and strong confounding by observables. Standard nonlinear regression models, which may work quite well for prediction, have two notable weaknesses when used to estimate heterogeneous treatment effects. First, they can yield badly biased estimates of treatment effects when fit to data with strong confounding. The Bayesian causal forest model presented in this paper avoids this problem by directly incorporating an estimate of the propensity function in the specification of the response model, implicitly inducing a covariate-dependent prior on the regression function. Second, standard approaches to response surface modeling do not provide adequate control over the strength of regularization over effect heterogeneity. The Bayesian causal forest model permits treatment effect heterogeneity to be regularized separately from the prognostic effect of control variables, making it possible to informatively “shrink to homogeneity”. While we focus on observational data, our methods are equally useful for inferring heterogeneous treatment effects from randomized controlled experiments where careful regularization is somewhat less complicated but no less important.
I am a professor of statistics at The University of Texas McCombs School of Business. My research focuses on Bayesian statistics in high-dimensional problems with applications ranging from finance to genetics. Some of my current projects include work on causal inference, machine learning, policy evaluation and empirical asset pricing. I am also the Executive Director of the Salem Center for Policy, a unit dedicated to supporting research, education, and dialogue around the impact of economic policies on markets and the free enterprise system.
Thursday, 04/29/2021, Time: 11:00am – 12:15pm PST
Count Statistics for Inferring Gene-Gene Interactions and Disease-Drug Associations
Haiyan Huang, Professor
Department of Statistics, University of California, Berkeley
With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the “big data” challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. In this talk, I will introduce several gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. In an ongoing project, we develop another count-based statistic to assess “reverse” correlations, which is useful for drug discoveries. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.
1. Wang YR, Waterman MS, Huang H. Gene coexpression measures in large heterogeneous samples using count statistics. Proceedings of the National Academy of Sciences. 2014 Nov 18;111(46):16371-6.
2. Wang YR, Liu K, Theusch E, Rotter JI, Medina MW, Waterman MS, Huang H. Generalized correlation measure using count statistics for gene expression data with ordered samples. Bioinformatics. 2018 Feb 15;34(4):617-24.
Haiyan Huang is a Professor in the Department of Statistics at the University of California, Berkeley. She is also currently serving as the director of the Center for Computational Biology at UC Berkeley. Prior to joining the faculty of UC Berkeley in 2003, Haiyan Huang did a two-year postdoc in Applied Statistics and Computational Biology at Harvard University. She obtained her Ph.D. in Applied Mathematics from the University of Southern California in 2001, and her BS in Math from Peking University, China in 1997.
As an applied statistician, her research is at the interface between statistics and data-rich scientific disciplines such as biology. Over the past few decades, rapidly evolving biological technologies have generated enormous high-dimensional, complex, noisy data, presenting increasingly pressing challenges to statistical and computational science. Her group has been devoted to addressing various modeling and analysis challenges arising from these data.
Thursday, 04/15/2021, Time: 11:00am – 12:15pm PST
Spillover Effects in Cluster Randomized Trials with Noncompliance
Luke Keele, Associate Professor
Surgery and Biostatistics, University of Pennsylvania
Cluster randomized trials (CRTs) are popular in public health and in the social sciences to evaluate a new treatment or policy where the new policy is randomly allocated to clusters of units rather than individual units. CRTs often feature both noncompliance, when individuals within a cluster are not exposed to the intervention, and treatment spillovers, when individuals within a cluster influence each other so that those who comply with the new policy may affect the outcomes of those who do not. Here, we study the identification of causal effects in CRTs when both noncompliance and treatment spillovers are present. We prove that the standard analysis of CRT data with noncompliance using instrumental variables does not identify the usual complier average causal effect when treatment spillovers are present. We extend this result and show that no analysis of CRT data can unbiasedly estimate local network causal effects. Finally, we develop bounds for these causal effects under the assumption that the treatment is not harmful compared to the control. We demonstrate these results with an empirical study of a deworming intervention in Kenya.
Luke Keele (Ph.D., University of North Carolina, Chapel Hill, 2003) is currently an Associate Professor at the University of Pennsylvania with joint appointments in Surgery and Biostatistics. Professor Keele specializes in research on applied statistics with a focus on causal inference, design-based methods, matching, natural experiments and instrumental variables. He also conducts research on topics in educational program evaluation, election administration, and health services research. He has published articles in the Journal of the American Statistical Association, Annals of Applied Statistics, Journal of the Royal Statistical Society, Series A, The American Statistician, American Political Science Review, Political Analysis, and Psychological Methods.
Thursday, 04/08/2021, Time: 11:00am – 12:15pm PST
Floodgate: Inference for model-free variable importance
Lucas Janson, Assistant Professor
Department of Statistics, Harvard University
Many modern applications seek to understand the relationship between an outcome variable Y and a covariate X in the presence of confounding variables Z = (Z_1,…,Z_p). Although much attention has been paid to testing whether Y depends on X given Z, in this paper we seek to go beyond testing by inferring the strength of that dependence. We first define our estimand, the minimum mean squared error (mMSE) gap, which quantifies the conditional relationship between Y and X in a way that is deterministic, model-free, interpretable, and sensitive to nonlinearities and interactions. We then propose a new inferential approach called floodgate that can leverage any regression function chosen by the user (including those fitted by state-of-the-art machine learning algorithms or derived from qualitative domain knowledge) to construct asymptotic confidence bounds, and we apply it to the mMSE gap. In addition to proving floodgate’s asymptotic validity, we rigorously quantify its accuracy (distance from confidence bound to estimand) and robustness. We demonstrate floodgate’s performance in a series of simulations and apply it to data from the UK Biobank to infer the strengths of dependence of platelet count on various groups of genetic mutations. This is joint work with my PhD student Lu Zhang, and the associated paper can be found at https://arxiv.org/abs/2007.01283.
Lucas Janson is an Assistant Professor in the Department of Statistics and an Affiliate in Computer Science at Harvard University. He works on high-dimensional inference and statistical machine learning, and recently received the NSF CAREER Award. Dr. Janson received his PhD in Statistics from Stanford University in 2017 and was advised by Emmanuel Candès.
Thursday, 03/11/2021, Time: 11:00am – 12:15pm PST
High-dimensional, multiscale online changepoint detection
Richard J. Samworth, Professor of Statistical Science and Fellow of St. John’s College
Statistical Laboratory, Centre for Mathematical Sciences, University of Cambridge
We introduce a new method for high-dimensional, online changepoint detection in settings where a p-variate Gaussian data stream may undergo a change in mean. The procedure works by performing likelihood ratio tests against simple alternatives of different scales in each coordinate, and then aggregating test statistics across scales and coordinates. The algorithm is online in the sense that both its storage requirements and worst-case computational complexity per new observation are independent of the number of previous observations; in practice, it may even be significantly faster than this. We prove that the patience, or average run length under the null, of our procedure is at least at the desired nominal level, and provide guarantees on its response delay under the alternative that depend on the sparsity of the vector of mean change. Simulations confirm the practical effectiveness of our proposal, which is implemented in the R package ‘ocd’, and we also demonstrate its utility on a seismology data set.
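To make the "online" property concrete, here is a minimal sketch of a much simpler relative of the procedure described above: Page's CUSUM recursion for a single coordinate and a single scale, testing N(0, 1) against the simple alternative N(scale, 1). It is not the 'ocd' algorithm, which aggregates many such statistics across scales and coordinates. The function name, the threshold, and the toy data stream are assumptions for illustration.

```python
def page_cusum_alarm(stream, scale, threshold):
    """Return the 0-based index of the first alarm, or None if there is none.

    Tests N(0, 1) against the simple alternative N(scale, 1) with Page's
    recursion; only the running statistic is stored between observations.
    """
    stat = 0.0
    for index, x in enumerate(stream):
        # Log-likelihood ratio increment of N(scale, 1) versus N(0, 1),
        # with the running sum reset to zero whenever it goes negative.
        stat = max(stat + scale * (x - scale / 2.0), 0.0)
        if stat >= threshold:
            return index
    return None


# Noise-free toy stream: the mean shifts from 0 to 1 at index 20.
stream = [0.0] * 20 + [1.0] * 20
alarm = page_cusum_alarm(stream, scale=1.0, threshold=4.0)
```

Each update uses only the previous value of the statistic and the new observation, so storage and per-observation cost do not grow with the length of the stream. In this toy stream the statistic stays at zero before the change and grows by 0.5 per observation afterwards, so the alarm is raised at index 27.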
Professor Richard Samworth obtained his PhD in Statistics from the University of Cambridge in 2004, and has remained in Cambridge since, becoming a full professor in 2013 and the Professor of Statistical Science in 2017. His main research interests are in high-dimensional and nonparametric statistics, and he has developed methods and theory for shape constrained inference, changepoint estimation, data perturbation techniques (subsampling, the bootstrap, random projections, knockoffs), classification and independence testing, amongst others. He received the COPSS Presidents’ Award in 2018 and currently serves as co-editor (with Ming Yuan) of the Annals of Statistics.
Thursday, 03/04/2021, Time: 11:00am – 12:15pm PST
Point Process Models for Sequence Detection in Neural Spike Trains
Professor Scott Linderman
Departments of Statistics and Computer Science, Stanford University
Sparse sequences of neural spikes are posited to underlie aspects of working memory, motor production, and learning. Discovering these sequences in an unsupervised manner is a longstanding problem in statistical neuroscience. I will present our new work using Neyman-Scott processes—a class of doubly stochastic point processes—to model sequences as a set of latent, continuous-time, marked events that produce cascades of neural spikes. This sparse representation of sequences opens new possibilities for spike train modeling. For example, we introduce learnable time warping parameters to model sequences of varying duration, as have been experimentally observed in neural circuits. Bayesian inference in this model requires integrating over the set of latent events, akin to inference in mixture of finite mixture (MFM) models. I will show how recent work on MFMs can be adapted to develop a collapsed Gibbs sampling algorithm for Neyman-Scott processes. Finally, I will present an empirical assessment of the model and algorithm on spike-train recordings from songbird higher vocal center and rodent hippocampus.
Scott Linderman is an Assistant Professor of Statistics and Computer Science (by courtesy) at Stanford University. He is also an Institute Scholar in the Wu Tsai Neurosciences Institute and a member of Stanford Bio-X and the Stanford AI Lab. Previously, he was a postdoctoral fellow with Liam Paninski and David Blei at Columbia University, and he completed his PhD in Computer Science at Harvard University with Ryan Adams and Leslie Valiant. Following family tradition, he slogged up Libe Slope as an undergraduate at Cornell University, just like his three brothers, parents, and a few generations of Lindermans before. Now he prefers Adirondack summers and California winters.
Friday, 02/26/2021, Time: 10:00am – 11:15am PST
Conformal Inference of Counterfactuals and Individual Treatment Effects
This is a Big Data and Machine Learning Seminar that is sponsored by the UCLA Department of Computer Science.
Lihua Lei, Postdoctoral Researcher
Department of Statistics, Stanford University
Evaluating treatment effect heterogeneity widely informs treatment decision making. At the moment, much emphasis is placed on the estimation of the conditional average treatment effect via flexible machine learning algorithms. While these methods enjoy some theoretical appeal in terms of consistency and convergence rates, they generally perform poorly in terms of uncertainty quantification. This is troubling since assessing risk is crucial for reliable decision-making in sensitive and uncertain environments. In this work, we propose a conformal inference-based approach that can produce reliable interval estimates for counterfactuals and individual treatment effects under the potential outcome framework. For completely randomized or stratified randomized experiments with perfect compliance, the intervals have guaranteed average coverage in finite samples regardless of the unknown data generating mechanism. For randomized experiments with ignorable compliance and general observational studies obeying the strong ignorability assumption, the intervals satisfy a doubly robust property which states the following: the average coverage is approximately controlled if either the propensity score or the conditional quantiles of potential outcomes can be estimated accurately. Numerical studies on both synthetic and real datasets empirically demonstrate that existing methods suffer from a significant coverage deficit even in simple models. In contrast, our methods achieve the desired coverage with reasonably short intervals. This is a joint work with Emmanuel Candès.
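For readers unfamiliar with conformal inference, the sketch below shows the plain, unweighted split-conformal construction of a prediction interval from held-out calibration data. It is only the basic building block: the method in the abstract additionally relies on propensity scores and conditional quantiles, which are omitted here. The function name and the toy calibration data are assumptions for illustration.

```python
import math


def split_conformal_interval(cal_predictions, cal_outcomes, new_prediction, alpha=0.1):
    """Symmetric split-conformal interval around `new_prediction`.

    The calibration pairs must not have been used to fit the predictor.
    Under exchangeability the interval covers a new outcome with
    probability at least 1 - alpha, averaged over calibration and test data.
    """
    # Conformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - p) for p, y in zip(cal_predictions, cal_outcomes))
    n = len(scores)
    # Finite-sample corrected rank of the score used as interval half-width.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this level: only the trivial interval is valid.
        return (-math.inf, math.inf)
    half_width = scores[rank - 1]
    return (new_prediction - half_width, new_prediction + half_width)


# Toy calibration set: predictions of 0 and outcomes 1, 2, ..., 10.
cal_predictions = [0.0] * 10
cal_outcomes = [float(k) for k in range(1, 11)]
interval = split_conformal_interval(cal_predictions, cal_outcomes, new_prediction=5.0)
```

With ten calibration residuals and alpha = 0.1, the corrected rank is 10, so the half-width is the largest residual and the interval is (-5.0, 15.0).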
Lihua Lei is a postdoctoral researcher in Statistics at Stanford University, advised by Professor Emmanuel Candès. His current research focuses on developing rigorous statistical methodologies for uncertainty quantification in applications involving complicated decision-making processes, to enhance reliability, robustness and fairness of the system. Prior to joining Stanford, he obtained his Ph.D. in statistics at UC Berkeley, working on causal inference, multiple hypothesis testing, network analysis and stochastic optimization.
Thursday, 02/25/2021, Time: 11:00am – 12:15pm PST
A New Robust and Powerful Weighted Logrank Test
Zhiguo Li, Professor
Department of Biostatistics and Bioinformatics, Duke University
In weighted logrank tests such as the Fleming-Harrington test and the Tarone-Ware test, certain weights are used to put more weight on early, middle or late events. The purpose is to maximize the power of the test. The optimal weight under an alternative depends on the true hazard functions of the groups being compared, and thus cannot be applied directly. We propose replacing the true hazard functions with their estimates and then using the estimated weights in a weighted logrank test. However, the resulting test does not control type I error correctly because the weights converge to 0 under the null in large samples. We then adjust the estimated optimal weights for correct type I error control while the resulting test still achieves improved power compared to existing weighted logrank tests, and it is shown to be robust in various scenarios. Extensive simulation is carried out to assess the proposed method and it is applied in several clinical studies in lung cancer.
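As background for the abstract, the following sketch computes the classical two-sample weighted logrank Z statistic with fixed Fleming-Harrington weights S(t-)^rho (1 - S(t-))^gamma, where S is the pooled Kaplan-Meier estimate. It is the existing test that the talk builds on, not the proposed test with estimated and adjusted weights. The function name, argument layout, and toy data are assumptions for illustration.

```python
import math


def weighted_logrank_z(times, events, groups, rho=0.0, gamma=0.0):
    """Z statistic of a Fleming-Harrington G(rho, gamma) weighted logrank test.

    times: observed times; events: 1 = event, 0 = censored; groups: 0 or 1.
    rho = gamma = 0 gives the ordinary logrank test; rho > 0 emphasises
    early events and gamma > 0 emphasises late events.
    """
    data = list(zip(times, events, groups))
    event_times = sorted({t for t, e, _ in data if e == 1})
    survival = 1.0  # pooled Kaplan-Meier estimate just before the current time
    score = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, pooled
        n1 = sum(1 for u, _, g in data if u >= t and g == 1)     # at risk, group 1
        d = sum(1 for u, e, _ in data if u == t and e == 1)      # events, pooled
        d1 = sum(1 for u, e, g in data if u == t and e == 1 and g == 1)
        w = survival ** rho * (1.0 - survival) ** gamma
        # Observed minus expected events in group 1, and its hypergeometric variance.
        score += w * (d1 - d * n1 / n)
        if n > 1:
            variance += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
        survival *= 1.0 - d / n
    return score / math.sqrt(variance) if variance > 0 else 0.0


# Toy data with no censoring: group 1 fails at times 1-4, group 0 at times 5-8.
times = [5, 6, 7, 8, 1, 2, 3, 4]
events = [1] * 8
groups = [0, 0, 0, 0, 1, 1, 1, 1]
z_logrank = weighted_logrank_z(times, events, groups)
z_early = weighted_logrank_z(times, events, groups, rho=1.0)
```

A positive Z here indicates more events in group 1 than expected under the null of equal hazards. Changing `rho` and `gamma` shifts the emphasis between early and late differences, which is the choice the abstract proposes to make from the data.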
Zhiguo Li is an Associate Professor of Biostatistics & Bioinformatics at Duke University, where he is also a member of the Duke Cancer Institute.
Thursday, 02/18/2021, Time: 11:00am – 12:15pm PST
PCA, Double Descent, and Gaussian Processes
Soledad Villar, Assistant Professor
Department of Applied Mathematics & Statistics and Mathematical Institute for Data Science, Johns Hopkins University
Overparameterization in deep learning has been shown to be powerful: very large models can fit the training data perfectly and yet generalize well. Investigation of overparameterization brought back the study of linear models, which, like more complex models, show a “double descent” behavior. This involves two features: (1) The risk (out-of-sample prediction error) can grow arbitrarily when the number of samples n approaches the number of parameters p (from either side), and (2) the risk decreases with p at p > n, sometimes achieving a lower value than the lowest risk at p < n. The divergence of the risk at p = n is related to the condition number of the empirical covariance in the feature set. For this reason, it can be avoided with regularization. In this work we show that performing a PCA-based dimensionality reduction also avoids the divergence at p = n; we provide a finite upper bound for the variance of the estimator that decreases with p. This result contrasts with recent work that shows that a different form of dimensionality reduction (one based on the population covariance instead of the empirical covariance) does not avoid the divergence. We connect these results to an analysis of adversarial attacks, which become more effective as they raise the condition number of the empirical covariance of the features. We show that ordinary least squares is arbitrarily susceptible to data-poisoning attacks in the overparameterized regime, unlike the underparameterized regime, and how regularization and dimensionality reduction improve its robustness. We also translate the results on the highly overparameterized linear regression regime to Gaussian Processes.
Soledad Villar is an Assistant Professor in Applied Mathematics and Statistics at Johns Hopkins University. Her research focuses on mathematical data science. In particular, she is interested in optimization algorithms arising from data applications and machine learning, as well as graph neural networks. She received a PhD in Mathematics from the University of Texas at Austin in 2017 and after that she held research positions at the Simons Institute in UC Berkeley, and the Center for Data Science at NYU. Her research is sponsored by NSF, The Simons Foundation, and EOARD. Soledad is originally from Uruguay.
Thursday, 02/11/2021, Time: 11:00am – 12:15pm PST
Online Hyperparameter Optimization by Real-time Recurrent Learning
Kyunghyun Cho, Associate Professor
Computer Science (Courant Institute) and Center for Data Science, New York University
Conventional hyperparameter optimization methods are computationally intensive and hard to generalize to scenarios that require dynamically adapting hyperparameters, such as life-long learning. Here, we propose an online hyperparameter optimization algorithm that is asymptotically exact and computationally tractable, both theoretically and practically. Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs). It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously, without repeatedly rolling out iterative optimization. This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time. (This is work done with Daniel Jiwoong Im and Cristina Savin.)
Kyunghyun Cho is an associate professor of computer science and data science at New York University and a CIFAR Fellow of Learning in Machines & Brains. He was a research scientist at Facebook AI Research from June 2017 to May 2020 and a postdoctoral fellow at University of Montreal until Summer 2015 under the supervision of Prof. Yoshua Bengio, after receiving MSc and PhD degrees from Aalto University in April 2011 and April 2014, respectively, under the supervision of Prof. Juha Karhunen, Dr. Tapani Raiko and Dr. Alexander Ilin. He tries his best to find a balance among machine learning, natural language processing, and life, but almost always fails to do so.
Thursday, 02/04/2021, Time: 11:00am – 12:15pm PST
Testing Goodness-of-fit and Conditional Independence with Approximate Co-sufficient Sampling
Rina Foygel Barber, Professor of Statistics
University of Chicago
Goodness-of-fit (GoF) testing is ubiquitous in statistics, with direct ties to model selection, confidence interval construction, conditional independence testing, and multiple testing, just to name a few applications. While testing the GoF of a simple (point) null hypothesis provides an analyst great flexibility in the choice of test statistic while still ensuring validity, most GoF tests for composite null hypotheses are far more constrained, as the test statistic must have a tractable distribution over the entire null model space. A notable exception is co-sufficient sampling (CSS): resampling the data conditional on a sufficient statistic for the null model guarantees valid GoF testing using any test statistic the analyst chooses. But CSS testing requires the null model to have a compact (in an information-theoretic sense) sufficient statistic, which only holds for a very limited class of models; even for a null model as simple as logistic regression, CSS testing is powerless. In this paper, we leverage the concept of approximate sufficiency to generalize CSS testing to essentially any parametric model with an asymptotically-efficient estimator; we call our extension “approximate CSS” (aCSS) testing. We quantify the finite-sample Type I error inflation of aCSS testing and show that it is vanishing under standard maximum likelihood asymptotics, for any choice of test statistic. We apply our proposed procedure both theoretically and in simulation to a number of models of interest to demonstrate its finite-sample Type I error and power.
This work is joint with Lucas Janson.
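The exact co-sufficient sampling idea that the abstract generalizes can be sketched in one of the few models where it is tractable: i.i.d. N(theta, 1) data with unknown theta, whose sufficient statistic is the sample mean. This is a minimal sketch of exact CSS in that toy model, not the approximate (aCSS) procedure from the talk. The function names, the outlier-sensitive test statistic, and the toy data are assumptions for illustration.

```python
import random


def co_sufficient_resample(data, rng):
    """Draw a copy of `data` conditional on its sample mean under N(theta, 1).

    Centred standard normal noise added to the observed mean has exactly the
    conditional distribution of the data given the sufficient statistic.
    """
    n = len(data)
    mean = sum(data) / n
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise_mean = sum(noise) / n
    return [mean + z - noise_mean for z in noise]


def css_p_value(data, statistic, draws=999, seed=0):
    """Monte Carlo goodness-of-fit p-value for any user-chosen statistic."""
    rng = random.Random(seed)
    observed = statistic(data)
    exceed = sum(
        statistic(co_sufficient_resample(data, rng)) >= observed
        for _ in range(draws)
    )
    return (1 + exceed) / (1 + draws)


def max_abs_deviation(sample):
    """Largest distance from the sample mean; sensitive to outliers."""
    centre = sum(sample) / len(sample)
    return max(abs(x - centre) for x in sample)


# Toy data with one gross outlier that the N(theta, 1) model cannot explain.
data = [0.1, -0.3, 0.5, 0.2, -0.1, 8.0]
p_value = css_p_value(data, max_abs_deviation)
```

Because every resample shares the observed sample mean, the p-value is valid whatever the unknown theta is, and the analyst is free to pick the test statistic. The difficulty the abstract describes is that few models admit such a simple conditional sampler.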
Thursday, 01/21/2021, Time: 11:00am – 12:15pm PST
Poisson and Marked Poisson Processes in 3D Imaging and Microscopy
Vivek Goyal, Professor
Electrical and Computer Engineering, Boston University
Detectors that are capable of sensing a single photon are no longer rare. They are used for 3D imaging on the iPad Pro and in many autonomous vehicles and mobile devices. Similarly, direct electron detection is used in particle-beam microscopy. Detections of such discrete particles are naturally modeled with stochastic arrival processes. This talk will focus on how Poisson process (PP) models for arrivals can be exploited to improve imaging. In lidar, when detector dead times are insignificant, these models can be used directly and lead to accurate depth and reflectivity imaging with as few as one detected photon per pixel. Furthermore, when significant dead times create statistical dependencies, Markov chain modeling can mitigate bias. Also, in focused ion beam microscopy, a Poisson-marked PP model inspires a new way to acquire and interpret the data. In both applications, principled statistical models lead to significant imaging improvements.
Vivek Goyal received his doctoral degree in electrical engineering from the University of California, Berkeley. He was a Member of Technical Staff at Bell Laboratories, a Senior Research Engineer for Digital Fountain, and the Esther and Harold E. Edgerton Associate Professor of Electrical Engineering at MIT. He was an adviser to 3dim Tech, winner of the 2013 MIT $100K Entrepreneurship Competition Launch Contest Grand Prize, and consequently with Google/Alphabet Nest Labs 2014-2016. He is now a Professor of Electrical and Computer Engineering at Boston University. Dr. Goyal is a Fellow of the IEEE and of the OSA, and he and his students have been awarded ten IEEE paper awards and seven thesis awards. He is a co-author of Foundations of Signal Processing (Cambridge University Press, 2014).
Thursday, 01/14/2021, Time: 11:00am – 12:15pm PST
Statistical Limits for the Matrix Tensor Product
Galen Reeves, Associate Professor
Departments of Statistical Science and of Electrical and Computer Engineering, Duke University
High-dimensional models involving the products of large random matrices include the spiked matrix models appearing in principal component analysis and the stochastic block model appearing in network analysis. In this talk I will present some recent theoretical work that provides an asymptotically exact characterization of the fundamental limits of inference for a broad class of these models. The first part of the talk will introduce the “matrix tensor product” model and describe some implications of the theory for community detection in correlated networks. The second part will highlight some of the ideas in the analysis, which builds upon ideas from information theory and statistical physics.
The material in this talk appears in the following papers:
Information-Theoretic Limits for the Matrix Tensor Product, Galen Reeves
Mutual Information in Community Detection with Covariate Information and Correlated Networks, Vaishakhi Mayya and Galen Reeves
Galen Reeves joined the faculty at Duke University in Fall 2013, and is currently an Associate Professor with a joint appointment in the Department of Electrical and Computer Engineering and the Department of Statistical Science. He completed his PhD in Electrical Engineering and Computer Sciences at the University of California, Berkeley in 2011, and he was a postdoctoral associate in the Department of Statistics at Stanford University from 2011 to 2013. His research interests include information theory and high-dimensional statistics. He received the NSF CAREER award in 2017.
Thursday, 01/07/2021, Time: 11:00am – 12:15pm PST
Mathematical Perspectives in the Emergence of Physics
Dr. Alex Ely Kossovsky
This presentation briefly explores humanity’s first major scientific achievement, namely the discovery of modern physics during the late Renaissance era, and demonstrates the decisive role of mathematics and rudimentary data analysis in facilitating this multigenerational accomplishment. In particular, the inspirational history of how mathematical advances like logarithms, discovered in the early 1600s, paved the way for this remarkable scientific advance in physics shall be explored, detailing how logarithms led Kepler to the discovery of his Third Law by facilitating arithmetical computations and by hinting at power-law relationships. Kepler’s planetary statistical discovery of his Third Law relates the square of the time period for one full orbit around the sun to the cube of the planet’s distance from the sun, namely Period^2 = K*Distance^3, and this remarkable discovery was courageously based on merely six data points, corresponding to the periods and distances of the six planets known in that era. In addition, the presentation shall briefly explore the role of Newton in midwifing the birth of science with his grand synthesis of Kepler’s celestial data analysis and Galileo’s terrestrial experiments. Lastly, the rise and fall of Bode’s Law shall be examined, and its ambitious but failed attempt to fit the orbital distances of the planets into an exact mathematical expression shall be presented as an illustrative example of the inability to apply rigid and exact mathematical formulas to probabilistic and chance events, such as the chaotic process of star and planet formation from the random distribution in space of gas and dust particles into much larger entities via the force of gravity.
Reference book: “The Birth of Science”, Springer Nature Publishing, by Alex Ely Kossovsky, Aug 2020, ISBN-10: 3030517438.
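The power-law relationship in the abstract above, Period^2 = K*Distance^3, becomes a straight line with slope 3/2 once logarithms are taken of both period and distance. The following is a minimal sketch of that idea, not a reconstruction of Kepler's own computation: the distances (in astronomical units) and periods (in years) are modern approximate values, and the function name `fit_power_law` is hypothetical.

```python
import math

# Modern approximate values for the six planets known in Kepler's time:
# (mean distance from the sun in AU, orbital period in years).
# These are an assumption for illustration, not Kepler's own figures.
planets = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def fit_power_law(pairs):
    """Least-squares fit of log(period) = log(c) + b * log(distance).

    Returns (b, c) so that period is approximately c * distance**b.
    """
    xs = [math.log(distance) for distance, _ in pairs]
    ys = [math.log(period) for _, period in pairs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


exponent, constant = fit_power_law(list(planets.values()))
# Period = c * Distance^b is the same as Period^2 = c^2 * Distance^(2b),
# so the Third Law corresponds to b close to 1.5 and K = c^2.
print(f"fitted exponent b = {exponent:.4f}")
print(f"implied K = c^2 = {constant ** 2:.4f}")
```

With distances in AU and periods in years, the constant K is close to 1 because Earth itself is one of the data points.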
Alex Ely Kossovsky is the inventor of a patented mathematical algorithm used in data fraud detection analysis, and he is considered by some to be the world’s leading expert on the topic of Benford’s Law. He is the author of three books on Benford’s Law as well as a more recent book titled “The Birth of Science.” He specialized in Applied Mathematics and Statistics at the City University of New York and in Physics and Pure Mathematics at the State University of New York at Stony Brook.
Tuesday, 12/08/2020, Time: 11:00am – 12:15pm PST
Bayesian modeling of global viral diffusions at scale
Andrew J. Holbrook, Assistant Professor of Biostatistics
I develop a Bayesian hierarchical model to infer the rate at which different seasonal influenza virus subtypes travel across global transportation networks. Data take the form of 5,392 viral sequences and their associated 14 million pairwise distances arising from the annual number of commercial airline seats between viral sampling locations. To adjust for shared evolutionary history of the viruses, I implement a phylogenetic extension to the Bayesian multidimensional scaling model and learn that subtype H3N2 spreads most effectively, consistent with its epidemic success relative to other seasonal influenza subtypes.
Dr. Andrew J. Holbrook is Assistant Professor at the UCLA Department of Biostatistics and has research interests in scalable Bayesian inference for applications in neural decoding and viral epidemiology. Andrew graduated from UC Berkeley in 2009 with a B.A. in German and Classical Languages. In 2018, he received a Ph.D. in Statistics from UC Irvine, where he completed his dissertation, “Geometric Bayes”, an investigation into the intersections of differential geometry and applied Bayesian inference. For this work, Andrew won honorable mention for the 2019 Leonard J. Savage Award in Theory and Methods, awarded by the International Society for Bayesian Analysis. Most recently Andrew received an NIH (K) Career Development Award to develop high-performance computing methods and model the global spread of viruses in a Big Data context.
Tuesday, 12/01/2020, Time: 11:00am – 12:15pm PST
Narrowest Significance Pursuit: Inference for multiple change-points in linear models
Piotr Fryzlewicz, Professor of Statistics
London School of Economics
We propose Narrowest Significance Pursuit (NSP), a general and flexible methodology for automatically detecting localised regions in data sequences which each must contain a change-point, at a prescribed global significance level. Here, change-points are understood as abrupt changes in the parameters of an underlying linear model. NSP works by fitting the postulated linear model over many regions of the data, using a certain multiresolution sup-norm loss, and identifying the shortest interval on which the linearity is significantly violated. The procedure then continues recursively to the left and to the right until no further intervals of significance can be found. The use of the multiresolution sup-norm loss is a key feature of NSP, as it enables the transfer of significance considerations to the domain of the unobserved true residuals, a substantial simplification. It also guarantees important stochastic bounds which directly yield exact desired coverage probabilities, regardless of the form or number of the regressors. NSP works with a wide range of distributional assumptions on the errors, including Gaussian with known or unknown variance, some light-tailed distributions, and some heavy-tailed, possibly heterogeneous distributions via self-normalisation. It also works in the presence of autoregression. The mathematics of NSP is, by construction, uncomplicated, and its key computational component uses simple linear programming. In contrast to the widely studied “post-selection inference” approach, NSP enables the opposite viewpoint and paves the way for the concept of “post-inference selection”. Pre-CRAN R code implementing NSP is available at https://github.com/pfryz/nsp.
The paper is available from: https://arxiv.org/abs/2009.05431
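To give a feel for the multiresolution sup-norm mentioned in the abstract above, here is a minimal sketch that assumes one common form of such a norm: the largest absolute value of the length-normalised partial sums of a sequence, taken over all contiguous intervals. The function name `multires_sup_norm` is hypothetical, the paper's exact definition (for example, the set of intervals used) may differ, and the linear-programming fit that NSP performs is not shown.

```python
import math
from itertools import accumulate


def multires_sup_norm(x):
    """Max over intervals [s, e) of |x[s] + ... + x[e-1]| / sqrt(e - s).

    Assumption: all contiguous intervals are considered; a practical
    implementation would likely restrict to a smaller grid of intervals.
    """
    prefix = [0.0] + list(accumulate(x))  # prefix[k] = x[0] + ... + x[k-1]
    n = len(x)
    best = 0.0
    for s in range(n):
        for e in range(s + 1, n + 1):
            stat = abs(prefix[e] - prefix[s]) / math.sqrt(e - s)
            best = max(best, stat)
    return best


# A constant offset accumulates over long intervals, so the norm is large.
shifted = multires_sup_norm([1, 1, 1, 1])
# An alternating sequence cancels over longer intervals, so the norm is small.
alternating = multires_sup_norm([1, -1, 1, -1])
print(shifted, alternating)
```

The contrast between the two inputs illustrates why a norm of this kind is sensitive to systematic departures from the fitted model over an interval, rather than to isolated noise.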
Piotr is with the Department of Statistics at the London School of Economics, UK. He obtained his PhD from the University of Bristol (2003) in wavelets in statistics, and spent a brief period of time in the late 00’s working in the finance industry. He has recently mainly been working on change point detection and inference, and on non-stationary time series. He is a former editor of the Journal of the Royal Statistical Society Series B, and a keen statistical consultant. The paper he is going to discuss today is a small milestone as it is Piotr’s 50th journal-length paper.
Tuesday, 11/24/2020, Time: 11:00am – 12:15pm PST
Challenges in Developing Learning Algorithms to Personalize Treatment in Real Time
Susan Murphy, Professor of Statistics
There are a variety of formidable challenges to reinforcement learning and control for use in designing digital health interventions for individuals with chronic disorders. Challenges include settings in which most treatments delivered by a smart device have immediate nonnegative (hopefully positive) effects but the largest longer-term effects tend to be negative due to user burden. Furthermore, the resulting data must be amenable to a variety of statistical analyses, including causal inference, as well as to monitoring analyses. Other challenges include an immature domain science concerning the system dynamics, yet the need to incorporate some domain science due to the low signal-to-noise ratio as well as non-stationary and sparse data. Here we describe how we confront these challenges, including our use of low-variance proxies for the delayed effects on the reward (e.g. immediate response) in an online “bandit” learning algorithm for use in personalizing mobile health interventions.
Dr. Susan A. Murphy is a Radcliffe Alumnae Professor at the Radcliffe Institute and a professor of statistics and computer science at the Harvard John A. Paulson School of Engineering and Applied Sciences. A 2013 recipient of a MacArthur Fellowship, she was previously the H. E. Robbins Distinguished University Professor of Statistics, a research professor at the Institute for Social Research, and a professor of psychiatry, all at the University of Michigan.
Dr. Murphy earned her BS from Louisiana State University and her PhD from the University of North Carolina at Chapel Hill. Her research focuses on analytic methods to design and evaluate medical treatments that adapt to individuals, including some that use mobile devices to deliver tailored interventions for drug addicts, smokers, and heart disease patients, among others. She is a member of the National Academy of Medicine and of the National Academy of Sciences.
Tuesday, 11/17/2020, Time: 11:00am – 12:15pm PST
Meditations on mediation
Sihai Dave Zhao, Assistant Professor of Statistics
University of Illinois at Urbana-Champaign
Mediation analysis studies the extent to which the effect of an exposure on an outcome is mediated by intervening variables. It has recently become extremely popular in genomics, where its application has raised several interesting new statistical questions. I will describe a few results of our study of some of these questions: a surprising property of hypothesis testing in mediation models, estimation and inference for high-dimensional mediators, and connections with surrogate markers. A relevant paper is:
Dr. Sihai Dave Zhao is an Associate Professor in the Department of Statistics at the University of Illinois at Urbana-Champaign. His research interests include shrinkage estimation, multiple testing, integrative genomics, and spatial transcriptomics. Dr. Zhao received an A.B. in Chemistry and Physics and a Ph.D. in Biostatistics from Harvard University before completing a postdoctoral fellowship in Biostatistics and Statistics at the University of Pennsylvania.
Tuesday, 11/10/2020, Time: 11:00am – 12:15pm PST
Detecting changes in multivariate extremes from climatological time series
Prof. Philippe Naveau, Research Scientist
Laboratoire des Sciences du Climat et de l’Environnement (LSCE) CNRS, France
Joint work with Sebastian Engelke (Geneva University) and Chen Zhou (Erasmus University Rotterdam)
Many effects of climate change seem to be reflected not in the mean temperatures, precipitation or other environmental variables, but rather in the frequency and severity of the extreme events in the distributional tails. The most serious climate-related disasters are caused by compound events that result from an unfortunate combination of several variables. Detecting changes in size or frequency of such compound events requires a statistical methodology that efficiently uses the largest observations in the sample.
We propose a simple, non-parametric test that decides whether two multivariate distributions exhibit the same tail behavior. The test is based on the entropy, namely Kullback–Leibler divergence, between exceedances over a high threshold of the two multivariate random vectors. We study the properties of the test and further explore its effectiveness for finite sample sizes.
Our main application is the analysis of daily heavy rainfall time series in France (1976–2015). Our goal in this application is to detect whether the multivariate extremal dependence structure of heavy rainfall changes with season and region.
Dr. Naveau is a Research Scientist at the “Laboratoire des Sciences du Climat et de l’Environnement”, IPSL-CNRS, France. His research interests include statistical climatology and hydrology, extreme value theory, time series analysis, and spatial statistics. Dr. Naveau received his Ph.D. in statistics from Colorado State University in 1998 under the supervision of Profs. R. Tweedie and R. Davis. He is an associate editor of many statistical and climate research journals, including the Annals of Applied Statistics. He is a prolific scholar with many highly cited publications (https://scholar.google.com/citations?user=yMvy9NIAAAAJ&hl=en).
Tuesday, 11/03/2020, Time: 3:30pm – 4:30pm PST
Exploring the complexity of the transcriptome using RNA-seq
Professor Alicia Oshlack, Group Leader
Peter MacCallum Cancer Centre
This is a special seminar co-listed with the Bioinformatics/Human-Genetics seminar series. Please note the special time due to the time difference between Los Angeles and Melbourne.
In this talk, Dr. Oshlack will introduce data-summary and data-mining methods her group developed for RNA sequencing (RNA-seq) data. In the beginning, she will introduce the basics of molecular biology and the generation of RNA-seq data.
Her talk will focus on the following two papers:
“Using equivalence class counts for fast and accurate testing of differential transcript usage” https://f1000research.com/articles/8-265/v2
“MINTIE: identifying novel structural and splice variants in transcriptomes using RNA-seq data” https://www.biorxiv.org/content/10.1101/2020.06.03.131532v2
Dr. Alicia Oshlack is an Australian bioinformatician and is Co-Head of Computational Biology at the Peter MacCallum Cancer Centre in Melbourne, Victoria, Australia. She is best known for her work developing methods for the analysis of transcriptome data as a measure of gene expression. She has characterized the role of gene expression in human evolution by comparisons of humans, chimpanzees, orangutans, and rhesus macaques, and works collaboratively in data analysis to improve the use of clinical sequencing of RNA samples by RNA-seq for human disease diagnosis.
Dr. Oshlack completed a Bachelor of Science (Hons) (1994–98) at the University of Melbourne, majoring in physics. She remained at the University of Melbourne for a PhD in astrophysics (1999–2003) on the central structure of radio quasars. Dr. Oshlack made a career transition to apply her mathematics to genetics after moving to the Walter and Eliza Hall Institute, where she worked as a research officer (2003–07) and then senior research officer (2007–11) in the Bioinformatics Division. Dr. Oshlack moved to the Murdoch Children’s Research Institute in Melbourne in 2011 to take up the post of Head of Bioinformatics. She was appointed co-chair of the Genomics and Bioinformatics advisory group for The Melbourne Genomics Health Alliance in 2013. She was also on the organising committee of Beyond the Genome in 2013. In 2019 Dr. Oshlack was appointed co-Head of Computational Biology at the Peter MacCallum Cancer Centre.
Tuesday, 10/27/2020, Time: 11am – 12:15pm PST
Why multi-omics is an interesting statistics problem; and how to use side-information to improve the power of multiple testing schemes
Prof. Wolfgang Huber, Group Leader and Senior Scientist
In this talk, Dr. Huber will talk about two methods: Independent Hypothesis Weighting (IHW) and Multi-Omics Factor Analysis (MOFA).
For IHW, please see:
mathematical statistics presentation: https://arxiv.org/abs/1701.05179
bioinformatics presentation: https://www.nature.com/articles/nmeth.3885
software implementation: http://bioconductor.org/packages/release/bioc/vignettes/IHW/inst/doc/introduction_to_ihw.html
For MOFA, please see https://www.embopress.org/doi/full/10.15252/msb.20178124
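As a rough illustration of how side-information can raise power in multiple testing, the sketch below implements a weighted Benjamini-Hochberg step: each p-value is divided by a nonnegative weight (weights averaging 1) before the usual step-up rule is applied. This is only the final step of a weighting scheme; how IHW learns the weights from a covariate is not shown, and the function name `weighted_bh` and the hand-picked weights are hypothetical.

```python
import math


def weighted_bh(pvalues, weights, alpha=0.05):
    """Weighted Benjamini-Hochberg: apply the BH step-up rule to p_i / w_i.

    Assumption: weights are nonnegative and average to 1. A weight of 0
    means the hypothesis can never be rejected.
    """
    m = len(pvalues)
    adjusted = [p / w if w > 0 else math.inf for p, w in zip(pvalues, weights)]
    order = sorted(range(m), key=lambda i: adjusted[i])
    # Find the largest rank k whose weighted p-value is at most alpha * k / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if adjusted[i] <= alpha * rank / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


pvalues = [0.03, 0.9]
unweighted = weighted_bh(pvalues, [1.0, 1.0])  # ordinary BH
weighted = weighted_bh(pvalues, [1.5, 0.5])    # side-information favours test 0
print(unweighted, weighted)
```

In this toy example ordinary BH rejects nothing at level 0.05, while shifting weight toward the first hypothesis allows it to be rejected.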
Dr. Wolfgang Huber’s interests lie in computational biology and statistical computing, and comprise method development as well as biological discovery and translation into clinical research. He collaborates with leading experimental groups in genetics and cancer research on the enabling of new, computationally intensive types of experiments and studies.
Dr. Huber is a founding member of Bioconductor, which started in 2001 and continues to be one of the largest bioinformatics projects. He coordinated the EC H2020 network SOUND (2015-18), serves on several Scientific Advisory Boards and consults for bioinformatics and pharmaceutical companies. He has authored >160 peer-reviewed publications.
Dr. Huber studied physics at the University of Freiburg, including an Erasmus year at the University of Edinburgh. He obtained a PhD in theoretical physics on stochastic models and simulation of open quantum systems. He moved to California in 1998 to do postdoctoral research in cheminformatics of small, drug-like compounds at IBM Research Almaden in San José. In 2000, his interest in cancer genomics and microarray analysis led him to the German Cancer Research Centre (DKFZ) in Heidelberg. In 2004, he joined EMBL to start a research group at its European Bioinformatics Institute (EBI) in Cambridge. In 2009, he took up a position in the newly formed Genome Biology unit of EMBL in Heidelberg, and in 2011 became EMBL Senior Scientist.
Tuesday, 10/20/2020, Time: 11am – 12:15pm PST
Topic: How to incorporate personal densities into predictive models: Pairwise Density Distances, Regularized Kernel Estimation and Smoothing Spline ANOVA models
Prof. Grace Wahba, I.J. Schoenberg-Hilldale Emerita Professor of Statistics
University of Wisconsin-Madison
We are concerned with the use of personal density functions or personal sample densities as subject attributes in prediction and classification models. The situation is particularly interesting when it is desired to combine other attributes with the personal densities in a prediction or classification model.
The procedure is (for each subject) to embed their sample density into a Reproducing Kernel Hilbert Space (RKHS), use this embedding to estimate pairwise distances between densities, use Regularized Kernel Estimation (RKE) with the pairwise distances to embed the subject (training) densities into a Euclidean space, and use the Euclidean coordinates as attributes in a Smoothing Spline ANOVA (SSANOVA) model. Elementary expository introductions to RKHS, RKE and SSANOVA occupy most of this talk.
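The first two steps of the procedure above (embedding each subject's sample into an RKHS and estimating pairwise distances between the embedded densities) can be sketched as follows. This is a minimal sketch under stated assumptions: a Gaussian kernel with a fixed bandwidth, one-dimensional samples, and the standard distance between empirical kernel mean embeddings. The function names are hypothetical, and the later RKE and SSANOVA steps are not shown.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on the real line (the kernel choice is an assumption)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def embedding_distance(xs, ys, bandwidth=1.0):
    """RKHS distance between the empirical mean embeddings of two samples."""
    squared = (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative rounding


def pairwise_distances(samples, bandwidth=1.0):
    """Matrix of embedding distances between every pair of subjects."""
    n = len(samples)
    return [
        [embedding_distance(samples[i], samples[j], bandwidth) for j in range(n)]
        for i in range(n)
    ]


# Three hypothetical subjects, each represented by a small personal sample.
samples = [[0.0, 0.1, -0.1], [0.0, 0.1, -0.1], [3.0, 3.2, 2.9]]
D = pairwise_distances(samples)
for row in D:
    print([round(d, 3) for d in row])
```

A matrix such as `D` is the kind of input that a subsequent embedding step would turn into Euclidean coordinates for each subject.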
Dr. Grace Wahba is an American statistician and now-retired I. J. Schoenberg-Hilldale Professor of Statistics at the University of Wisconsin–Madison. She is a pioneer in methods for smoothing noisy data. Best known for the development of generalized cross-validation and “Wahba’s problem,” she has developed methods with applications in demographic studies, machine learning, DNA microarrays, risk modeling, medical imaging, and climate prediction.
Dr. Wahba is a member of the National Academy of Sciences and a fellow of several academic societies including the American Academy of Arts and Sciences, the American Association for the Advancement of Science, the American Statistical Association, and the Institute of Mathematical Statistics. Over the years she has received a selection of notable awards in the statistics community:
– R. A. Fisher Lectureship, COPSS, August 2014
– Gottfried E. Noether Senior Researcher Award, Joint Statistics Meetings, August 2009
– Committee of Presidents of Statistical Societies Elizabeth Scott Award, 1996
– First Emanuel and Carol Parzen Prize for Statistical Innovation, 1994
Tuesday, 10/13/2020, Time: 11am – 12:15pm PST
Reflections on Breiman’s Two Cultures of Statistical Modeling & An updated dynamic Bayesian forecasting model for the 2020 election
Andrew Gelman, Professor of Statistics
Abstract: In this talk, Dr. Gelman will talk about two papers:
In an influential paper from 2001, the statistician Leo Breiman distinguished between two cultures in statistical modeling: “One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown.” Breiman’s “two cultures” article deserves its fame: it includes many interesting real-world examples and an empirical perspective which is a breath of fresh air compared to the standard approach of statistics papers at that time, which was a mix of definitions, theorems, and simulation studies showing the coverage of nominal 95% confidence intervals.
We constructed an election forecasting model for the Economist that builds on Linzer’s (2013) dynamic Bayesian forecasting model and provides an election day forecast by partially pooling two separate predictions: (1) a forecast based on historically relevant economic and political factors such as personal income growth, presidential approval, and incumbency; and (2) information from state and national polls during the election season. The two sources of information are combined using a time-series model for state and national opinion. Our model also accounts for some aspects of non-sampling errors in polling. The model is fit using the open-source statistics packages R and Stan (R Core Team, 2020; Stan Development Team, 2020) and is updated every day with new polls. The forecast is available at https://projects.economist.com/us-2020-forecast/president, a description of the model-building process is at https://projects.economist.com/us-2020-forecast/president/how-this-works, and all code is at https://github.com/TheEconomist/us-potus-model.
Dr. Andrew Gelman is a professor of statistics and political science at Columbia University. He has received the Outstanding Statistical Application award three times from the American Statistical Association, the award for best article published in the American Political Science Review, and the Council of Presidents of Statistical Societies award for outstanding contributions by a person under the age of 40. His books include Bayesian Data Analysis (with John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Don Rubin), Teaching Statistics: A Bag of Tricks (with Deb Nolan), Data Analysis Using Regression and Multilevel/Hierarchical Models (with Jennifer Hill), Red State, Blue State, Rich State, Poor State: Why Americans Vote the Way They Do (with David Park, Boris Shor, and Jeronimo Cortina), A Quantitative Tour of the Social Sciences (co-edited with Jeronimo Cortina), and Regression and Other Stories (with Jennifer Hill and Aki Vehtari). Dr. Gelman has done research on a wide range of topics, including: why it is rational to vote; why campaign polls are so variable when elections are so predictable; why redistricting is good for democracy; reversals of death sentences; police stops in New York City; the statistical challenges of estimating small effects; the probability that your vote will be decisive; seats and votes in Congress; social network structure; arsenic in Bangladesh; radon in your basement; toxicology; medical imaging; and methods in surveys, experimental design, statistical inference, computation, and graphics.
Tuesday, 10/06/2020, Time: 11am – 12:00pm
Testing for a Change in Mean after Changepoint Detection
Prof. Paul Fearnhead, Distinguished Professor of Statistics
While many methods are available to detect structural changes in a time series, few procedures are available to quantify the uncertainty of these estimates post-detection. In this work, we fill this gap by proposing a new framework to test the null hypothesis that there is no change in mean around an estimated changepoint. We further show that it is possible to efficiently carry out this framework in the case of changepoints estimated by binary segmentation, variants of binary segmentation, ℓ0 segmentation, or the fused lasso. Our setup allows us to condition on much smaller selection events than existing approaches, which yields higher powered tests. Our procedure leads to improved power in simulation and additional discoveries in a dataset of chromosomal guanine-cytosine content. Our new changepoint inference procedures are freely available in the R package ChangepointInference. This is joint work with Sean Jewell and Daniela Witten.
This is based on the paper: https://arxiv.org/pdf/1910.04291 and motivated by earlier work: https://academic.oup.com/biostatistics/advance-article-abstract/doi/10.1093/biostatistics/kxy083/5310127
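For readers unfamiliar with binary segmentation, the sketch below shows its first step for a change in mean: scanning every split point and picking the one that maximises the standard CUSUM statistic. It illustrates only the detection step that produces an estimated changepoint; the post-detection test that is the subject of the talk is not shown, and the function name `best_split` is hypothetical.

```python
import math


def best_split(y):
    """Return (tau, stat): the split maximising the CUSUM statistic.

    tau is the number of observations in the left segment, and stat is
    sqrt(tau * (n - tau) / n) * |mean(left) - mean(right)|.
    """
    n = len(y)
    total = sum(y)
    best_tau, best_stat = None, 0.0
    left_sum = 0.0
    for tau in range(1, n):
        left_sum += y[tau - 1]
        mean_left = left_sum / tau
        mean_right = (total - left_sum) / (n - tau)
        stat = math.sqrt(tau * (n - tau) / n) * abs(mean_left - mean_right)
        if stat > best_stat:
            best_tau, best_stat = tau, stat
    return best_tau, best_stat


# A noiseless toy series whose mean jumps from 0 to 5 after four observations.
tau, stat = best_split([0, 0, 0, 0, 5, 5, 5, 5])
print(tau, round(stat, 3))
```

Full binary segmentation would apply the same scan recursively to the segments on either side of the chosen split, stopping when the statistic falls below a threshold.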
Dr. Paul Fearnhead is Distinguished Professor of Statistics at Lancaster University. He is a researcher in computational statistics, in particular Sequential Monte Carlo methods. His interests include sampling theory and genetics – he has published several papers working on the epidemiology of campylobacter by looking at recombination events in a large sample of genomes. Since January 2018 he has been the editor of Biometrika. He won the Adams Prize and the Guy Medal in Bronze of the Royal Statistical Society in 2007.
Tuesday, 09/29/2020, Time: 11am – 12:15pm
Individual-centered Partial Information in Social Networks
Prof. Xin Tong, Assistant Professor
Data Sciences and Operations
University of Southern California
Most existing statistical network analysis literature assumes a global view of the network, under which community detection, testing, and other statistical procedures are developed. Yet in the real world, people frequently make decisions based on their partial understanding of the network information. As individuals barely know beyond friends’ friends, we assume that an individual of interest knows all paths of length up to L = 2 that originate from her. As a result, this individual’s perceived adjacency matrix B differs significantly from the usual adjacency matrix A based on the global information. The new individual-centered partial information framework sparks an array of interesting endeavors from theory to practice. Key general properties on the eigenvalues and eigenvectors of BE, a major term of B, are derived. These general results, coupled with the classic stochastic block model, lead to a new theory-backed spectral approach to detect the community memberships based on an anchored individual’s partial information. Real data analysis delivers interesting insights that cannot be obtained from global network analysis.
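One way to picture the difference between A and B described above is the following minimal sketch. It assumes that "knowing all paths of length up to 2" means the individual knows every edge lying on such a path, so an edge is kept in B exactly when at least one of its endpoints is a direct neighbour of the individual. This reading, and the function name `perceived_adjacency`, are assumptions; the authors' exact definition of B may differ.

```python
def perceived_adjacency(A, i):
    """Adjacency matrix as perceived by individual i.

    Assumption: an edge (j, k) is known to i when it lies on a path of
    length at most 2 starting at i, i.e. when j or k is a neighbour of i.
    """
    n = len(A)
    neighbors = {j for j in range(n) if A[i][j]}
    B = [[0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if A[j][k] and (j in neighbors or k in neighbors):
                B[j][k] = A[j][k]
    return B


# A path graph 0 - 1 - 2 - 3 - 4, viewed from individual 0.
A = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
B = perceived_adjacency(A, 0)
for row in B:
    print(row)
```

Individual 0 sees the edges 0-1 and 1-2 but not 2-3 or 3-4, so B keeps only part of the global matrix A.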
Dr. Xin Tong is an assistant professor at the Department of Data Sciences and Operations, University of Southern California. His research focuses on asymmetric supervised and unsupervised learning, high-dimensional statistics, and network-related problems. He is an associate editor for Journal of the American Statistical Association and Journal of Business and Economic Statistics. Before joining the current position, he was an instructor of statistics in the Department of Mathematics at the Massachusetts Institute of Technology. He obtained his Ph.D. in Operations Research from Princeton University.
Thursday, 09/24/2020, Time: 11am – 12:15pm
Active Learning and Projects with Engaging Contexts: Keys to Successful Teaching of Applied Statistics Face-to-Face and Remotely
Mahtash Esfandiari, Senior Lecturer
Stephanie Stacy, Graduate Student
Samuel Baugh, Graduate Student
UCLA Department of Statistics
The objective of this presentation is to: 1) elaborate a theoretical model that underlies successful teaching of applied statistics face-to-face and remotely, 2) explain the teaching and learning strategies that helped us reach the objectives underlying the proposed model, and 3) describe the evaluation strategies we used to assess the extent to which we reached our objectives in face-to-face and remote teaching. Sample projects, as well as quantitative and qualitative findings, will be presented from courses on “Introduction to Statistical Consulting” and “Linear Models” to illustrate the application of the proposed theoretical model to the teaching of applied statistics pre-pandemic face-to-face and post-pandemic on Zoom.