BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Ery Arias-Castro (UC San Diego)
DTSTART:20200417T150000Z
DTEND:20200417T161200Z
DTSTAMP:20260422T212705Z
UID:sss/1
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/1/">On u
 sing graph distances to estimate Euclidean and related distances</a>\nby E
 ry Arias-Castro (UC San Diego) as part of Stochastics and Statistics Semin
 ar Series\n\n\nAbstract\nGraph distances have proven quite useful in machi
 ne learning/statistics\, particularly in the estimation of Euclidean or ge
 odesic distances. The talk will include a partial review of the literature
 \, and then present more recent developments on the estimation of curvatur
 e-constrained distances on a surface\, as well as on the estimation of Euc
 lidean distances based on an unweighted and noisy neighborhood graph.\n
LOCATION:https://researchseminars.org/talk/sss/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Sébastien Bubeck (Microsoft Research)
DTSTART:20200424T150000Z
DTEND:20200424T160000Z
DTSTAMP:20260422T212705Z
UID:sss/2
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/2/">How 
 to Trap a Gradient Flow</a>\nby Sébastien Bubeck (Microsoft Research) as 
 part of Stochastics and Statistics Seminar Series\n\n\nAbstract\nIn 1993\,
  Stephen A. Vavasis proved that in any finite dimension\, there exists a f
 aster method than gradient descent to find stationary points of smooth non
 -convex functions. In dimension 2 he proved that $1/\\epsilon$ gradient qu
 eries are enough\, and that $1/\\sqrt{\\epsilon}$ queries are necessary. W
 e close this gap by providing an algorithm based on a new local-to-glob
 al phenomenon for smooth non-convex functions. Some higher-dimensional re
 sults will also be discussed. I will also present an extension of the $1/
 \\sqrt{\\epsilon}$ lower bound to
  randomized algorithms\, mainly as an excuse to discuss some beautiful top
 ics such as Aldous’ 1983 paper on local minimization on the cube\, and B
 enjamini-Pemantle-Peres’ 1998 construction of unpredictable walks.\n\nJo
 int work with Dan Mikulincer\n
LOCATION:https://researchseminars.org/talk/sss/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Alexandre d'Aspremont (ENS\, CNRS)
DTSTART:20200501T150000Z
DTEND:20200501T160000Z
DTSTAMP:20260422T212705Z
UID:sss/3
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/3/">Naiv
 e feature selection: Sparsity in naive Bayes</a>\nby Alexandre d'Aspremont
  (ENS\, CNRS) as part of Stochastics and Statistics Seminar Series\n\n\nAb
 stract\nDue to its linear complexity\, naive Bayes classification remains 
 an attractive supervised learning method\, especially in very large-scale 
 settings. We propose a sparse version of naive Bayes\, which can be used f
 or feature selection. This leads to a combinatorial maximum-likelihood pro
 blem\, for which we provide an exact solution in the case of binary data\,
  or a bound in the multinomial case. We prove that our bound becomes tight
  as the marginal contribution of additional features decreases. Both binar
 y and multinomial sparse models are solvable in time almost linear in prob
 lem size\, representing a very small extra relative cost compared to the c
 lassical naive Bayes. Numerical experiments on text data show that the nai
 ve Bayes feature selection method is as statistically effective as state-o
 f-the-art feature selection methods such as recursive feature elimination\
 , l1-penalized logistic regression and LASSO\, while being orders of magni
 tude faster. For a large data set with more than 1.6 million training p
 oints and about 12 million features\, and with a non-optimized CPU impl
 ementation\, our sparse naive Bayes model can be trained in less than 1
 5 seconds. Authors: A. Askari\, A. d’Aspremont\, L. El Ghaoui.\n
LOCATION:https://researchseminars.org/talk/sss/3/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Gesine Reinert (University of Oxford)
DTSTART:20200911T150000Z
DTEND:20200911T160000Z
DTSTAMP:20260422T212705Z
UID:sss/4
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/4/">Stei
 n’s method for multivariate continuous distributions and applications</a
 >\nby Gesine Reinert (University of Oxford) as part of Stochastics and Sta
 tistics Seminar Series\n\n\nAbstract\nStein’s method is a key method for
  assessing distributional distance\, mainly for one-dimensional distributi
 ons. In this talk we provide a general approach to Stein’s method for mu
 ltivariate continuous distributions. Among the applications we consider is
  the Wasserstein distance between two continuous probability distributions
  under the assumption of the existence of a Poincaré constant.\n\nThis is join
 t work with Guillaume Mijoule (INRIA Paris) and Yvik Swan (Liège).\n
LOCATION:https://researchseminars.org/talk/sss/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Caroline Uhler (MIT)
DTSTART:20200918T150500Z
DTEND:20200918T160500Z
DTSTAMP:20260422T212705Z
UID:sss/5
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/5/">Caus
 al Inference and Overparameterized Autoencoders in the Light of Drug Repur
 posing for SARS-CoV-2</a>\nby Caroline Uhler (MIT) as part of Stochastics 
 and Statistics Seminar Series\n\n\nAbstract\nMassive data collection holds
  the promise of a better understanding of complex phenomena and ultimately
 \, of better decisions. An exciting opportunity in this regard stems from 
 the growing availability of perturbation/intervention data (drugs\, knoc
 kouts\, overexpression\, etc.) in biology. In order to obtain mechanistic 
 insights from such data\, a major challenge is the development of a framew
 ork that integrates observational and interventional data and allows predi
 cting the effect of yet unseen interventions or transporting the effect of
  interventions observed in one context to another. I will present a framew
 ork for causal inference based on such data and particularly highlight the
  role of overparameterized autoencoders. We end by demonstrating how these
 ideas can be applied to drug repurposing in the current SARS-CoV-2 crisi
 s.\n
LOCATION:https://researchseminars.org/talk/sss/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Dylan Foster (MIT)
DTSTART:20200925T150500Z
DTEND:20200925T160500Z
DTSTAMP:20260422T212705Z
UID:sss/6
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/6/">Sepa
 rating Estimation from Decision Making in Contextual Bandits</a>\nby Dylan
  Foster (MIT) as part of Stochastics and Statistics Seminar Series\n\n\nAb
 stract\nThe contextual bandit is a sequential decision making problem in w
 hich a learner repeatedly selects an action (e.g.\, a news article to disp
 lay) in response to a context (e.g.\, a user’s profile) and receives a r
 eward\, but only for the action they selected. Beyond the classic explore-
 exploit tradeoff\, a fundamental challenge in contextual bandits is to dev
 elop algorithms that can leverage flexible function approximation to model
  similarity between contexts\, yet have computational requirements compara
 ble to classical supervised learning tasks such as classification and regr
 ession. To this end\, we provide the first universal and optimal reduction
  from contextual bandits to online regression. We show how to transform an
 y oracle for online regression with a given value function class into an a
 lgorithm for contextual bandits with the induced policy class\, with no ov
 erhead in runtime or memory requirements. Conceptually\, our results show 
 that it is possible to provably separate estimation and decision making i
 nto distinct algorithmic building blocks\, and that this can be effective b
 oth in theory and in practice. Time permitting\, I will discuss extensions
  of these techniques to more challenging reinforcement learning problems.\
 n
LOCATION:https://researchseminars.org/talk/sss/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Richard Nickl (University of Cambridge)
DTSTART:20201002T150500Z
DTEND:20201002T160500Z
DTSTAMP:20260422T212705Z
UID:sss/7
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/7/">Baye
 sian inverse problems\, Gaussian processes\, and partial differential equa
 tions</a>\nby Richard Nickl (University of Cambridge) as part of Stochasti
 cs and Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/sss/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Gábor Lugosi (Pompeu Fabra University)
DTSTART:20201009T150500Z
DTEND:20201009T160500Z
DTSTAMP:20260422T212705Z
UID:sss/8
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/8/">On E
 stimating the Mean of a Random Vector</a>\nby Gábor Lugosi (Pompeu Fabra 
 University) as part of Stochastics and Statistics Seminar Series\n\nAbstra
 ct: TBA\n
LOCATION:https://researchseminars.org/talk/sss/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Carola-Bibiane Schönlieb (University of Cambridge)
DTSTART:20201016T150500Z
DTEND:20201016T160500Z
DTSTAMP:20260422T212705Z
UID:sss/9
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/9/">Data
  driven variational models for solving inverse problems</a>\nby Carola-Bib
 iane Schönlieb (University of Cambridge) as part of Stochastics and Stati
 stics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/sss/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jose Blanchet (Stanford University)
DTSTART:20201023T150500Z
DTEND:20201023T160500Z
DTSTAMP:20260422T212705Z
UID:sss/10
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/10/">Sta
 tistical Aspects of Wasserstein Distributionally Robust Optimization Estim
 ators</a>\nby Jose Blanchet (Stanford University) as part of Stochastics a
 nd Statistics Seminar Series\n\n\nAbstract\nWasserstein-based di
 stributionally robust optimization problems are formulated as min-max gam
 es in which a statistician chooses a parameter to minimize an expected loss a
 gainst an adversary (say nature) which wishes to maximize the loss by choo
 sing an appropriate probability model within a certain non-parametric clas
 s. Recently\, these formulations have been studied in the context in which
  the non-parametric class chosen by nature is defined as a Wasserstein-dis
 tance neighborhood around the empirical measure. It turns out that by appr
 opriately choosing the loss and the geometry of the Wasserstein distance o
 ne can recover a wide range of classical statistical estimators (including
  Lasso\, Graphical Lasso\, SVM\, group Lasso\, among many others). This ta
 lk studies a wide range of rich statistical quantities associated with the
 se problems\; for example\, the optimal (in a certain sense) choice of the
  adversarial perturbation\, weak convergence of natural confidence regions
  associated with these formulations\, and asymptotic normality of the DRO 
 estimators. (This talk is based on joint work with Y. Kang\, K. Murthy\, a
 nd N. Si.)\n
LOCATION:https://researchseminars.org/talk/sss/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Daniela Witten (University of Washington)
DTSTART:20201106T160500Z
DTEND:20201106T170500Z
DTSTAMP:20260422T212705Z
UID:sss/12
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/12/">Val
 id hypothesis testing after hierarchical clustering</a>\nby Daniela Witten
  (University of Washington) as part of Stochastics and Statistics Seminar 
 Series\n\n\nAbstract\nAs datasets continue to grow in size\, in many setti
 ngs the focus of data collection has shifted away from testing pre-specifi
 ed hypotheses\, and towards hypothesis generation. Researchers are often i
 nterested in performing an exploratory data analysis in order to generate 
 hypotheses\, and then testing those hypotheses on the same data\; I will r
 efer to this as ‘double dipping’. Unfortunately\, double dipping can l
 ead to highly inflated Type 1 errors. In this talk\, I will consider the s
 pecial case of hierarchical clustering. First\, I will show that sample-
 splitting does not solve the ‘double dipping’ problem for clustering. 
 Then\, I will propose a test for a difference in means between estimated c
 lusters that accounts for the cluster estimation process\, using a selecti
 ve inference framework. I will also show an application of this approach t
 o single-cell RNA-sequencing data. This is joint work with Lucy Gao (Unive
 rsity of Waterloo) and Jacob Bien (University of Southern California).\n
LOCATION:https://researchseminars.org/talk/sss/12/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Mary Wootters (Stanford University)
DTSTART:20201113T160500Z
DTEND:20201113T170500Z
DTSTAMP:20260422T212705Z
UID:sss/13
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/13/">Sha
 rp Thresholds for Random Subspaces\, and Applications</a>\nby Mary Wootter
 s (Stanford University) as part of Stochastics and Statistics Seminar Seri
 es\n\n\nAbstract\nWhat combinatorial properties are likely to be
  satisfied by a random subspace over a finite field? For example\, is it l
 ikely that not too many points lie in any Hamming ball? What about any cub
 e? We show that there is a sharp threshold on the dimension of the subspa
 ce at which the answers to these questions change from “extremely likely
 ” to “extremely unlikely\,” and moreover we give a simple characteri
 zation of this threshold for different properties. Our motivation comes fr
 om error correcting codes\, and we use this characterization to make progr
 ess on the questions of list-decoding and list-recovery for random linear 
 codes\, and also to establish the list-decodability of random Low Density 
 Parity-Check (LDPC) codes.\n\nThis talk is based on joint works with Ven
 katesan Guruswami\, Ray Li\, Jonathan Mosheiff\, Nicolas Resch\, Noga R
 on-Zewi\, and Shashwat Silas.\n
LOCATION:https://researchseminars.org/talk/sss/13/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Arnaud Doucet (University of Oxford)
DTSTART:20201120T160500Z
DTEND:20201120T170500Z
DTSTAMP:20260422T212705Z
UID:sss/14
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/14/">Per
 fect Simulation for Feynman-Kac Models using Ensemble Rejection Sampling</
 a>\nby Arnaud Doucet (University of Oxford) as part of Stochastics and Sta
 tistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/sss/14/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Rong Ge (Duke University)
DTSTART:20201204T160500Z
DTEND:20201204T170500Z
DTSTAMP:20260422T212705Z
UID:sss/15
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/15/">A L
 ocal Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Net
 </a>\nby Rong Ge (Duke University) as part of Stochastics and Statistics S
 eminar Series\n\n\nAbstract\nThe training of neural networks optimizes com
 plex non-convex objective functions\, yet in practice simple algorithms ac
 hieve great performance. Recent works suggest that over-parameterization c
 ould be a key ingredient in explaining this discrepancy. However\, curre
 nt theories cannot fully explain the role of over-parameterization. In pa
 rticular\, they either work in a regime where neurons don't move much\, or requ
 ire a large number of neurons. In this paper we develop a local convergence the
 ory for mildly over-parameterized two-layer neural net. We show that as long a
 s the loss is already lower than a threshold (polynomial in relevant parameter
 s)\, all student neurons in an over-parameterized two-layer neural network wil
 l converge to one of the teacher neurons\, and the loss will go to 0. Our resu
 lt holds for any number of student neurons as long as it's at least as large a
 s the number of teacher neurons\, and gives explicit bounds on convergence rat
 es that are independent of the number of student neurons. Based on joint wor
 k with Mo Zhou and Chi Jin.\n
LOCATION:https://researchseminars.org/talk/sss/15/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jerry Li (Microsoft Research)
DTSTART:20210219T160000Z
DTEND:20210219T171200Z
DTSTAMP:20260422T212705Z
UID:sss/16
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/16/">Fas
 ter and Simpler Algorithms for List Learning</a>\nby Jerry Li (Microsoft R
 esearch) as part of Stochastics and Statistics Seminar Series\n\n\nAbstrac
 t\nThe goal of list learning is to understand how to learn basic statistic
 s of a dataset when it has been corrupted by an overwhelming fraction of o
 utliers. More formally\, one is given a set of points $S$\, of which an $\
 \alpha$-fraction $T$ are promised to be well-behaved. The goal is then to 
 output an $O(1/\\alpha)$-sized list of candidate means\, so that one of th
 ese candidates is close to the true mean of the points in $T$. In many w
 ays\, list learning can be thought of as the natural robust generalization
  of clustering mixture models. This formulation of the problem was first p
 roposed in Charikar-Steinhardt-Valiant STOC’17\, which gave the first po
 lynomial-time algorithm that achieved nearly optimal error guarantees. Mo
 re recently\, exciting work of Cherapanamjeri-Mohanty-Yau FOCS’20 gave a
 n algorithm which ran in time $\\widetilde{O} (n d \\mathrm{poly} (1 / \\a
 lpha))$. In particular\, this runtime is nearly linear in the input size f
 or $1/\\alpha = O(1)$\; however\, the runtime quickly becomes impracti
 cal for reasonably small $1/\\alpha$. Moreover\, both of these algorith
 ms are quite complicated.\n\nIn our work\, we have two main contributio
 ns. First\
 , we give a polynomial-time algorithm for this problem which achieves opti
 mal error\, which is considerably simpler than the previously known algori
 thms. Second\, we then build off of these insights to develop a more sophi
 sticated algorithm based on lazy mirror descent which runs in time $\\wide
 tilde{O}(n d / \\alpha + 1/\\alpha^6)$\, and which also achieves optimal e
 rror. Our algorithm improves upon the runtime of previous work for all $1/
 \\alpha = O(\\sqrt{d})$. The goal of this talk is to give a more or less sel
 f-contained proof of the first\, and then explain at a high level how to u
 se these ideas to develop our faster algorithm.\n\nJoint work with Ilias D
 iakonikolas\, Daniel Kane\, Daniel Kongsgaard\, and Kevin Tian\n
LOCATION:https://researchseminars.org/talk/sss/16/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yury Polyanskiy (MIT)
DTSTART:20210226T160000Z
DTEND:20210226T171200Z
DTSTAMP:20260422T212705Z
UID:sss/17
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/17/">Sel
 f-regularizing Property of Nonparametric Maximum Likelihood Estimator in M
 ixture Models</a>\nby Yury Polyanskiy (MIT) as part of Stochastics and Sta
 tistics Seminar Series\n\n\nAbstract\nIntroduced by Kiefer and Wolfowitz i
 n 1956\, the nonparametric maximum likelihood estimator (NPMLE) is a widely u
 sed methodology for learning mixture models and empirical Bayes estimation
 . Sidestepping the non-convexity in mixture likelihood\, the NPMLE estimat
 es the mixing distribution by maximizing the total likelihood over the spa
 ce of probability measures\, which can be viewed as an extreme form of ove
 r-parameterization.\n\nIn this work we discover a surprising property of t
 he NPMLE solution. Consider\, for example\, a Gaussian mixture model on th
 e real line with a subgaussian mixing distribution. Leveraging complex-ana
 lytic techniques\, we show that with high probability the NPMLE based on a sa
 mple of size $n$ has $O(\\log n)$ atoms (mass points)\, significantly impro
 ving the deterministic upper bound of $n$ due to Lindsay (1983). Notably\, a
 ny such Gaussian mixture is statistically indistinguishable from a finite o
 ne with $O(\\log n)$ components (and this is tight for certain mixtures). T
 hus\, absent any explicit form of model selection\, NPMLE automatically ch
 ooses the right model complexity\, a property we term self-regularization.
  Extensions to other exponential families are given. As a statistical appl
 ication\, we show that this structural property can be harnessed to bootst
 rap the existing Hellinger risk bound of the (parametric) MLE for finite Gau
 ssian mixtures to the NPMLE for general Gaussian mixtures\, recovering a res
 ult of Zhang (2009). Time permitting\, we will discuss connections to appr
 oaching the optimal regret in empirical Bayes. This is based on joint work
  with Yihong Wu (Yale).\n
LOCATION:https://researchseminars.org/talk/sss/17/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Bhaswar B. Bhattacharya (University of Pennsylvania)
DTSTART:20210305T160000Z
DTEND:20210305T171200Z
DTSTAMP:20260422T212705Z
UID:sss/18
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/18/">Det
 ection Thresholds for Distribution-Free Non-Parametric Tests: The Curious 
 Case of Dimension 8</a>\nby Bhaswar B. Bhattacharya (University of Penns
 ylvania) as part of Stochastics and Statistics Seminar Series\n\nAbstrac
 t: TB
 A\n
LOCATION:https://researchseminars.org/talk/sss/18/
END:VEVENT
BEGIN:VEVENT
SUMMARY:James Robins (Harvard)
DTSTART:20210312T160000Z
DTEND:20210312T171200Z
DTSTAMP:20260422T212705Z
UID:sss/19
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/19/">On 
 nearly assumption-free tests of nominal confidence interval coverage for c
 ausal parameters estimated by machine learning</a>\nby James Robins (Harva
 rd) as part of Stochastics and Statistics Seminar Series\n\nAbstract: TBA\
 n
LOCATION:https://researchseminars.org/talk/sss/19/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Daniel Roy (University of Toronto)
DTSTART:20210319T150000Z
DTEND:20210319T161200Z
DTSTAMP:20260422T212705Z
UID:sss/20
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/20/">Rel
 axing the I.I.D. Assumption: Adaptively Minimax Optimal Regret via Root-En
 tropic Regularization</a>\nby Daniel Roy (University of Toronto) as part o
 f Stochastics and Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/sss/20/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Vladimir Vovk (Royal Holloway\, University of London)
DTSTART:20210326T150000Z
DTEND:20210326T161200Z
DTSTAMP:20260422T212705Z
UID:sss/21
DESCRIPTION:by Vladimir Vovk (Royal Holloway\, University of London) as pa
 rt of Stochastics and Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/sss/21/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Thibaut Le Gouic (MIT)
DTSTART:20210402T150000Z
DTEND:20210402T161200Z
DTSTAMP:20260422T212705Z
UID:sss/22
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/22/">Sam
 pler for the Wasserstein barycenter</a>\nby Thibaut Le Gouic (MIT) as part
  of Stochastics and Statistics Seminar Series\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/sss/22/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Suriya Gunasekar (Microsoft Research)
DTSTART:20210409T150000Z
DTEND:20210409T161200Z
DTSTAMP:20260422T212705Z
UID:sss/23
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/23/">Fun
 ction space view of linear multi-channel convolution networks with bounde
 d weight norm</a>\nby Suriya Gunasekar (Microsoft Research) as part of Sto
 chastics and Statistics Seminar Series\n\n\nAbstract\nThe magnitude of the
  weights of a neural network is a fundamental measure of complexity that p
 lays a crucial role in the study of implicit and explicit regularization. 
 For example\, in recent work\, gradient descent updates in overparameteriz
 ed models asymptotically lead to solutions that implicitly minimize the $\\el
 l_2$ norm of the parameters of the model\, resulting in an inductive bias t
 hat is highly architecture dependent. To investigate the properties of lea
 rned functions\, it is natural to consider a function space view given by 
 the minimum $\\ell_2$ norm of weights required to realize a given function with
  a given network. We call this the “induced regularizer” of the netwo
 rk. Building on a line of recent work\, we study the induced regularizer o
 f linear convolutional neural networks with a focus on the role of kernel 
 size and the number of channels. We introduce an SDP relaxation of the ind
 uced regularizer\, that we show is tight for networks with a single input 
 channel. Using this SDP formulation\, we show that the induced regularizer
  is independent of the number of output channels for single-input chan
 nel networks\, and for multi-input channel networks\, we show independence
  given sufficiently many output channels. Moreover\, we show that as the k
 ernel size increases\, the induced regularizer interpolates between a basi
 s-invariant norm and a basis-dependent norm that promotes sparse structure
 s in Fourier space.\n\nBased on joint work with Meena Jagadeesan and Ilya 
 Razenshteyn.\n
LOCATION:https://researchseminars.org/talk/sss/23/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Eric Laber (Duke University)
DTSTART:20210416T150000Z
DTEND:20210416T161200Z
DTSTAMP:20260422T212705Z
UID:sss/24
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/24/">Sam
 ple size considerations in precision medicine</a>\nby Eric Laber (Duke Uni
 versity) as part of Stochastics and Statistics Seminar Series\n\nAbstract:
  TBA\n
LOCATION:https://researchseminars.org/talk/sss/24/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Hilary Finucane (Broad Institute)
DTSTART:20210423T150000Z
DTEND:20210423T161200Z
DTSTAMP:20260422T212705Z
UID:sss/25
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/25/">Pri
 oritizing genes from genome-wide association studies</a>\nby Hilary Finuca
 ne (Broad Institute) as part of Stochastics and Statistics Seminar Series\
 n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/sss/25/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Ann Lee (Carnegie Mellon University)
DTSTART:20210514T150000Z
DTEND:20210514T161200Z
DTSTAMP:20260422T212705Z
UID:sss/26
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/sss/26/">Lik
 elihood-Free Frequentist Inference</a>\nby Ann Lee (Carnegie Mellon Univer
 sity) as part of Stochastics and Statistics Seminar Series\n\nAbstract: TB
 A\n
LOCATION:https://researchseminars.org/talk/sss/26/
END:VEVENT
END:VCALENDAR
