BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Maxim Raginsky (University of Illinois Urbana-Champaign)
DTSTART:20200519T160000Z
DTEND:20200519T173000Z
DTSTAMP:20260422T212556Z
UID:IASML/1
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/1/">Ne
 ural SDEs: deep generative models in the diffusion limit</a>\nby Maxim Rag
 insky (University of Illinois Urbana-Champaign) as part of IAS Seminar Ser
 ies on Theoretical Machine Learning\n\n\nAbstract\nIn deep generative mode
 ls\, the latent variable is generated by a time-inhomogeneous Markov chain
 \, where at each time step we pass the current state through a parametric 
 nonlinear map\, such as a feedforward neural net\, and add a small indepen
 dent Gaussian perturbation. In this talk\, based on joint work with Belind
 a Tzen\, I will discuss the diffusion limit of such models\, where we incr
 ease the number of layers while sending the step size and the noise varian
 ce to zero. I will first provide a unified viewpoint on both sampling and 
 variational inference in such generative models through the lens of stocha
 stic control. Then I will show how we can quantify the expressiveness of d
 iffusion-based generative models. Specifically\, I will prove that one can
  efficiently sample from a wide class of terminal target distributions by 
 choosing the drift of the latent diffusion from the class of multilayer fe
 edforward neural nets\, with the accuracy of sampling measured by the Kull
 back-Leibler divergence to the target distribution. Finally\, I will brief
 ly discuss a scheme for unbiased\, finite-variance simulation in such mode
 ls. This scheme can be implemented as a deep generative model with a rando
 m number of layers.\n
LOCATION:https://researchseminars.org/talk/IASML/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Roni Rosenfeld (Carnegie Mellon University)
DTSTART:20200521T190000Z
DTEND:20200521T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/2
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/2/">Fo
 recasting epidemics and pandemics</a>\nby Roni Rosenfeld (Carnegie Mellon 
 University) as part of IAS Seminar Series on Theoretical Machine Learning\
 n\n\nAbstract\nEpidemiological forecasting is critically needed for decisi
 on making by national and local governments\, public health officials\, he
 althcare institutions and the general public. The Delphi group at Carnegie
  Mellon University was founded in 2012 to advance the theory and technolog
 ical capability of epidemiological forecasting\, and to promote its role i
 n decision making\, both public and private. Our long term vision is to ma
 ke epidemiological forecasting as useful and universally accepted as weath
 er forecasting is today. I will describe some of the methods we developed 
 over the past eight years for forecasting flu\, dengue and other epidemic
 s\, and the challenges we faced in adapting these methods to the COVID pa
 ndemic in the past few months.\n
LOCATION:https://researchseminars.org/talk/IASML/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Aleksander Madry (MIT)
DTSTART:20200609T162000Z
DTEND:20200609T175000Z
DTSTAMP:20260422T212556Z
UID:IASML/4
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/4/">Wh
 at do our models learn?</a>\nby Aleksander Madry (MIT) as part of IAS Semi
 nar Series on Theoretical Machine Learning\n\n\nAbstract\nLarge-scale visi
 on benchmarks have driven---and often even defined---progress in machine l
 earning. However\, these benchmarks are merely proxies for the real-world 
 tasks we actually care about. How well do our benchmarks capture such task
 s?\n\nIn this talk\, I will discuss the alignment between our benchmark-dr
 iven ML paradigm and the real-world use cases that motivate it. First\, w
 e will explore examples of biases in the ImageNet dataset\, and how state-
 of-the-art models exploit them. We will then demonstrate how these biases 
 arise as a result of design choices in the data collection and curation pr
 ocesses.\n\nBased on joint works with Logan Engstrom\, Andrew Ilyas\, Shib
 ani Santurkar\, Jacob Steinhardt\, Dimitris Tsipras and Kai Xiao.\n
LOCATION:https://researchseminars.org/talk/IASML/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Michael I. Jordan (UC Berkeley)
DTSTART:20200611T190000Z
DTEND:20200611T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/5
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/5/">On
  Langevin Dynamics in Machine Learning</a>\nby Michael I. Jordan (UC Berke
 ley) as part of IAS Seminar Series on Theoretical Machine Learning\n\n\nAb
 stract\nLangevin diffusions are continuous-time stochastic processes that 
 are based on the gradient of a potential function. As such they have many 
 connections---some known and many still to be explored---to gradient-based
  machine learning. I'll discuss several recent results in this vein: (1) t
 he use of Langevin-based algorithms in bandit problems\; (2) the accelerat
 ion of Langevin diffusions\; (3) how to use Langevin Monte Carlo without m
 aking smoothness assumptions. I'll present these results in the context of
  a general argument about the virtues of continuous-time perspectives in t
 he analysis of discrete-time optimization and Monte Carlo algorithms.\n
LOCATION:https://researchseminars.org/talk/IASML/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Avrim Blum (Toyota Technological Institute at Chicago)
DTSTART:20200616T190000Z
DTEND:20200616T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/6
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/6/">On
  learning in the presence of biased data and strategic behavior</a>\nby Av
 rim Blum (Toyota Technological Institute at Chicago) as part of IAS Semina
 r Series on Theoretical Machine Learning\n\n\nAbstract\nIn this talk I wil
 l discuss two lines of work involving learning in the presence of biased d
 ata and strategic behavior.  In the first\, we ask whether fairness constr
 aints on learning algorithms can actually improve the accuracy of the clas
 sifier produced\, when training data is unrepresentative or corrupted due 
 to bias.  Typically\, fairness constraints are analyzed as a tradeoff with
  classical objectives such as accuracy.  Our results here show there are n
 atural scenarios where they can be a win-win\, helping to improve overall 
 accuracy.  In the second line of work we consider strategic classification
 : settings where the entities being measured and classified wish to be cla
 ssified as positive (e.g.\, college admissions) and will try to modify the
 ir observable features if possible to make that happen.  We consider this 
 in the online setting where a particular challenge is that updates made by
  the learning algorithm will change how the inputs behave as well.\n
LOCATION:https://researchseminars.org/talk/IASML/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Csaba Szepesvári (University of Alberta)
DTSTART:20200618T190000Z
DTEND:20200618T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/7
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/7/">Th
 e challenges of model-based reinforcement learning and how to overcome the
 m</a>\nby Csaba Szepesvári (University of Alberta) as part of IAS Seminar
  Series on Theoretical Machine Learning\n\n\nAbstract\nSome believe that t
 ruly effective and efficient reinforcement learning algorithms must explic
 itly construct and explicitly reason with models that capture the causal s
 tructure of the world. In short\, model-based reinforcement learning is no
 t optional. As this is not a new belief\, it may be surprising that empiri
 cally\, at least as far as the current state of the art is concerned\, the maj
 ority of the top performing algorithms are model-free. In this talk\, I wi
 ll define three major challenges that need to be overcome for model-based 
 methods to take their place above\, or before\, the model-free ones: (1) pla
 nning with large models\; (2) models are never well-specified\; (3) models
  need to focus on task-relevant aspects and ignore others. For each of the
  challenges\, I will describe recent results that address them and I will 
 also take a tally of the most interesting (and challenging) remaining open
  problems.\n
LOCATION:https://researchseminars.org/talk/IASML/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Sanjeev Arora (Princeton University and IAS)
DTSTART:20200625T190000Z
DTEND:20200625T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/8
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/8/">In
 stance-Hiding Schemes for Private Distributed Learning</a>\nby Sanjeev Aro
 ra (Princeton University and IAS) as part of IAS Seminar Series on Theoret
 ical Machine Learning\n\n\nAbstract\nAn important problem today is how to 
 allow multiple distributed entities to train a shared neural network on th
 eir private data while protecting data privacy. Federated learning is a s
 tandard framework for distributed deep learning\, and one would like to a
 ssure full privacy in that framework. The proposed methods\, such as homo
 morphic encryption and differential privacy\, come with drawbacks such a
 s large computational overhead or a large drop in accuracy. T
 his work introduces a new and simple encryption of training data\, which h
 ides the information in it and allows its use in the usual deep learning p
 ipeline. The encryption is inspired by the classic notion of instance-hid
 ing in cryptography. Experiments show that it allows training with a fair
 ly small effect on final accuracy.\n\nWe also give some theoretical analy
 sis of pr
 ivacy guarantees for this encryption\, showing that violating privacy requ
 ires attackers to solve a difficult computational problem.\n\nJoint work w
 ith Yangsibo Huang\, Zhao Song\, and Kai Li. To appear at ICML 2020.\n
LOCATION:https://researchseminars.org/talk/IASML/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jennifer Listgarten (UC Berkeley)
DTSTART:20200707T163000Z
DTEND:20200707T174500Z
DTSTAMP:20260422T212556Z
UID:IASML/9
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/9/">Ma
 chine learning-based design (of proteins\, small molecules and beyond)</a>
 \nby Jennifer Listgarten (UC Berkeley) as part of IAS Seminar Series on Th
 eoretical Machine Learning\n\n\nAbstract\nData-driven design is making hea
 dway into a number of application areas\, including protein\, small-molecu
 le\, and materials engineering. The design goal is to construct an object 
 with desired properties\, such as a protein that binds to a target more ti
 ghtly than previously observed. To that end\, costly experimental measurem
 ents are being replaced with calls to a high-capacity regression model tra
 ined on labeled data\, which can be leveraged in an in silico search for p
 romising design candidates. The aim then is to discover designs that are b
 etter than the best design in the observed data. This goal puts machine-le
 arning based design in a much more difficult spot than traditional applica
 tions of predictive modelling\, since successful design requires\, by defi
 nition\, some degree of extrapolation---a pushing of the predictive mode
 l to its unknown limits\, in parts of the design space that are a prio
 ri unknown. In this talk\, I will anchor this overall problem in protei
 n engine
 ering\, and discuss our emerging approaches to tackle it.\n
LOCATION:https://researchseminars.org/talk/IASML/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Anima Anandkumar (Caltech)
DTSTART:20200709T190000Z
DTEND:20200709T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/10
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/10/">R
 ole of Interaction in Competitive Optimization</a>\nby Anima Anandkumar (C
 altech) as part of IAS Seminar Series on Theoretical Machine Learning\n\n\
 nAbstract\nCompetitive optimization is needed for many ML problems such as
  training GANs\, robust reinforcement learning\, and adversarial learning.
  Standard approaches to competitive optimization involve each agent indepe
 ndently optimizing their objective functions using SGD or other gradient-b
 ased approaches. However\, they suffer from oscillations and instability\,
  since the optimization does not account for interaction among the players
 . We introduce competitive gradient descent (CGD) that explicitly incorpor
 ates interaction by solving for Nash equilibrium of a local game. We exten
 d CGD to competitive mirror descent (CMD) for solving conically constraine
 d competitive problems by using the dual geometry induced by a Bregman div
 ergence.\n\nWe demonstrate the effectiveness of our approach for training 
 GANs and solving constrained reinforcement learning (RL) problems. We also
  derive a competitive policy optimization method to train RL agents in com
 petitive games. Finally\, we provide a novel perspective on training GANs 
 by pointing out the "GAN-dilemma"\, a fundamental flaw of the divergence
 -minimization perspective on GANs. Instead\, we argue that an implicit c
 ompeti
 tive regularization due to simultaneous training methods\, such as CGD\, i
 s a crucial mechanism behind GAN performance.\n
LOCATION:https://researchseminars.org/talk/IASML/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Max Welling (University of Amsterdam)
DTSTART:20200721T163000Z
DTEND:20200721T174500Z
DTSTAMP:20260422T212556Z
UID:IASML/11
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/11/">G
 raph Nets: The Next Generation</a>\nby Max Welling (University of Amsterda
 m) as part of IAS Seminar Series on Theoretical Machine Learning\n\n\nAbst
 ract\nIn this talk I will introduce our next generation of graph neural ne
 tworks. GNNs have the property that they are invariant to permutations of 
 the nodes in the graph and to rotations of the graph as a whole. We claim 
 this is unnecessarily restrictive and in this talk we will explore extensi
 ons of these GNNs to more flexible equivariant constructions. In particula
 r\, Natural Graph Networks for general graphs are globally equivariant und
 er permutations of the nodes but can still be executed through local messa
 ge passing protocols. Our mesh-CNNs on manifolds are equivariant under SO(
 2) gauge transformations and as such\, unlike regular GNNs\, entertain non
 -isotropic kernels. And finally our SE(3)-transformers are local message p
 assing GNNs\, invariant to permutations but equivariant to global SE(3) tr
 ansformations. These developments clearly emphasize the importance of geom
 etry and symmetries as design principles for graph (or other) neural netwo
 rks.\n\nJoint with: Pim de Haan and Taco Cohen (Natural Graph Networks)
 \; Pim de Haan\, Maurice Weiler and Taco Cohen (Mesh-CNNs)\; Fabian Fuc
 hs and Daniel Worrall (SE(3)-Transformers)\n
LOCATION:https://researchseminars.org/talk/IASML/11/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Yoshua Bengio (Université de Montréal)
DTSTART:20200723T190000Z
DTEND:20200723T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/12
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/12/">P
 riors for Semantic Variables</a>\nby Yoshua Bengio (Université de Montré
 al) as part of IAS Seminar Series on Theoretical Machine Learning\n\n\nAbs
 tract\nSome of the aspects of the world around us are captured in natural 
 language and refer to semantic high-level variables\, which often have a c
 ausal role (referring to agents\, objects\, and actions or intentions). Th
 ese high-level variables also seem to satisfy very peculiar characteristic
 s which low-level data (like images or sounds) do not share\, and it would
  be good to clarify these characteristics in the form of priors which can 
 guide the design of machine learning systems benefitting from these assump
 tions. Since these priors are not just about the joint distribution betwee
 n the semantic variables (e.g. it has a sparse factor graph corresponding 
 to a modular decomposition of knowledge) but also about how the distributi
 on changes (typically by causal interventions)\, this analysis may also he
 lp to build machine learning systems which can generalize better out-of-di
 stribution. Introducing such assumptions is necessary to even start having
  a theory about generalizing out-of-distribution. There are also fascinati
 ng connections between these priors and what is hypothesized about conscio
 us processing in the brain\, with conscious processing allowing us to reas
 on (i.e.\, perform chains of inferences about the past and the future\, as
  well as credit assignment) at the level of these high-level variables. Th
 is involves attention mechanisms and short-term memory to form a bottlenec
 k of information being broadcast around the brain between different parts 
 of it\, as we focus on different high-level variables and some of their in
 teractions. The presentation summarizes a few recent results using some of
  these ideas for discovering causal structure and modularizing recurrent n
 eural networks with attention mechanisms in order to obtain better out-of-
 distribution generalization and move deep learning towards capturing some 
 of the functions associated with conscious processing over high-level sema
 ntic variables.\n
LOCATION:https://researchseminars.org/talk/IASML/12/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jeffrey Negrea (University of Toronto)
DTSTART:20200714T163000Z
DTEND:20200714T174500Z
DTSTAMP:20260422T212556Z
UID:IASML/13
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/13/">R
 elaxing the I.I.D. assumption: Adaptive minimax optimal sequential predicti
 on with expert advice</a>\nby Jeffrey Negrea (University of Toronto) as pa
 rt of IAS Seminar Series on Theoretical Machine Learning\n\n\nAbstract\nWe
  consider sequential prediction with expert advice when the data are gener
 ated stochastically\, but the distributions generating the data may vary a
 rbitrarily among some constraint set. We quantify relaxations of the class
 ical I.I.D. assumption in terms of possible constraint sets\, with I.I.D. 
 at one extreme\, and an adversarial mechanism at the other. The Hedge algo
 rithm\, long known to be minimax optimal in the adversarial regime\, h
 as recently been shown to also be minimax optimal in the I.I.D. setting. W
 e show that Hedge is sub-optimal between these extremes\, and present a ne
 w algorithm that is adaptively minimax optimal with respect to our relaxat
 ions of the I.I.D. assumption\, without knowledge of which setting prevail
 s.\n
LOCATION:https://researchseminars.org/talk/IASML/13/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Arthur Gretton (University College London)
DTSTART:20200728T163000Z
DTEND:20200728T174500Z
DTSTAMP:20260422T212556Z
UID:IASML/14
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/14/">G
 eneralized Energy-Based Models</a>\nby Arthur Gretton (University College 
 London) as part of IAS Seminar Series on Theoretical Machine Learning\n\n\
 nAbstract\nI will introduce Generalized Energy Based Models (GEBM) for gen
 erative modelling. These models combine two trained components: a base dis
 tribution (generally an implicit model)\, which can learn the support of d
 ata with low intrinsic dimension in a high dimensional space\; and an ener
 gy function\, to refine the probability mass on the learned support. Both 
 the energy function and base jointly constitute the final model\, unlike G
 ANs\, which retain only the base distribution (the "generator"). In partic
 ular\, while the energy function is analogous to the GAN critic function\,
  it is not discarded after training.\nGEBMs are trained by alternating bet
 ween learning the energy and the base. Both training stages are well-defin
 ed: the energy is learned by maximising a generalized likelihood\, and the
  resulting energy-based loss provides informative gradients for learning t
 he base. Samples from the posterior on the latent space of the trained mod
 el can be obtained via MCMC\, thus finding regions in this space that prod
 uce better quality samples. Empirically\, the GEBM samples on image-genera
 tion tasks are of much better quality than those from the learned generato
 r alone\, indicating that all else being equal\, the GEBM will outperform 
 a GAN of the same complexity. GEBMs also return state-of-the-art performan
 ce on density modelling tasks\, when using base measures with an expli
 cit form.\n
LOCATION:https://researchseminars.org/talk/IASML/14/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Peter Stone (University of Texas at Austin)
DTSTART:20200730T190000Z
DTEND:20200730T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/15
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/15/">E
 fficient Robot Skill Learning via Grounded Simulation Learning\, Imitation
  Learning from Observation\, and Off-Policy Reinforcement Learning</a>\nby
  Peter Stone (University of Texas at Austin) as part of IAS Seminar Series
  on Theoretical Machine Learning\n\n\nAbstract\nFor autonomous robots to o
 perate in the open\, dynamically changing world\, they will need to be abl
 e to learn a robust set of skills from relatively little experience. This 
 talk begins by introducing Grounded Simulation Learning as a way to bridge
  the so-called reality gap between simulators and the real world in order 
 to enable transfer learning from simulation to a real robot. It then intro
 duces two new algorithms for imitation learning from observation that enab
 le a robot to mimic demonstrated skills from state-only trajectories\, wit
 hout any knowledge of the actions selected by the demonstrator. Connection
 s to theoretical advances in off-policy reinforcement learning will be hig
 hlighted throughout.\n\nGrounded Simulation Learning has led to the fastes
 t known stable walk on a widely used humanoid robot\, and imitation learni
 ng from observation opens the possibility of robots learning from the vast
  trove of videos available online.\n
LOCATION:https://researchseminars.org/talk/IASML/15/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Aapo Hyvärinen (University of Helsinki)
DTSTART:20200804T163000Z
DTEND:20200804T174500Z
DTSTAMP:20260422T212556Z
UID:IASML/16
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/16/">N
 onlinear independent component analysis</a>\nby Aapo Hyvärinen (Universit
 y of Helsinki) as part of IAS Seminar Series on Theoretical Machine Learni
 ng\n\n\nAbstract\nUnsupervised learning\, in particular learning general n
 onlinear representations\, is one of the deepest problems in machine learn
 ing. Estimating latent quantities in a generative model provides a princip
 led framework\, and has been successfully used in the linear case\, e.g. w
 ith independent component analysis (ICA) and sparse coding. However\, exte
 nding ICA to the nonlinear case has proven to be extremely difficult: A st
 raightforward extension is unidentifiable\, i.e. it is not possible to re
 cover those latent components that actually generated the data. Here\, we 
 show that this problem can be solved by using additional information eithe
 r in the form of temporal structure or an additional observed variable. We
  start by formulating two generative models in which the data is an arbitr
 ary but invertible nonlinear transformation of time series (components) wh
 ich are statistically independent of each other. Drawing from the theory o
 f linear ICA\, we formulate two distinct classes of temporal structure of 
 the components which enable identification\, i.e. recovery of the original
  independent components. We further generalize the framework to the case w
 here instead of temporal structure\, an additional "auxiliary" variable is
  observed and used by means of conditioning (e.g. audio in addition to vid
 eo). Our methods are closely related to "self-supervised" methods heuristi
 cally proposed in computer vision\, and also provide a theoretical foundat
 ion for such methods in terms of estimating a latent-variable model. Likew
 ise\, we show how variants of deep latent-variable models such as VAEs ca
 n be seen as nonlinear ICA\, and made identifiable by suitable conditionin
 g.\n
LOCATION:https://researchseminars.org/talk/IASML/16/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Eric Xing (Carnegie Mellon University)
DTSTART:20200806T190000Z
DTEND:20200806T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/17
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/17/">A
  Blueprint of Standardized and Composable Machine Learning</a>\nby Eric Xi
 ng (Carnegie Mellon University) as part of IAS Seminar Series on Theoretic
 al Machine Learning\n\n\nAbstract\nIn handling wide range of experiences r
 anging from data instances\, knowledge\, constraints\, to rewards\, advers
 aries\, and lifelong interplay in an ever-growing spectrum of tasks\, cont
 emporary ML/AI research has resulted in thousands of models\, learning par
 adigms\, optimization algorithms\, not mentioning countless approximation 
 heuristics\, tuning tricks\, and black-box oracles\, plus combinations of 
 all above. While pushing the field forward rapidly\, these results also ma
 ke a comprehensive grasp of existing ML techniques more and more difficult
 \, and make standardized\, reusable\, repeatable\, reliable\, and explaina
 ble practice and further development of ML/AI products quite costly\, if p
 ossible\, at all. In this talk\, we present a simple and systematic bluepr
 int of ML\, from the aspects of losses\, optimization solvers\, and model 
 architectures\, that provides a unified mathematical formulation for learn
 ing with all experiences and tasks. The blueprint offers a holistic unders
 tanding of the diverse ML algorithms\, guidance for operationalizing ML t
 o create problem solutions in a composable and mechanical manner\, and a u
 nified framework for theoretical analysis.\n
LOCATION:https://researchseminars.org/talk/IASML/17/
END:VEVENT
BEGIN:VEVENT
SUMMARY:John Shawe-Taylor (University College London)
DTSTART:20200811T163000Z
DTEND:20200811T174500Z
DTSTAMP:20260422T212556Z
UID:IASML/18
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/18/">S
 tatistical Learning Theory for Modern Machine Learning</a>\nby John Shawe-
 Taylor (University College London) as part of IAS Seminar Series on Theore
 tical Machine Learning\n\n\nAbstract\nProbably Approximately Correct (PAC)
  learning has attempted to analyse the generalisation of learning systems 
 within the statistical learning framework. It has been referred to as a 
 ‘worst case’ analysis\, but the tools have been extended to analyse ca
 ses where benign distributions mean we can still generalise even if worst 
 case bounds suggest we cannot. The talk will cover the PAC-Bayes approach 
 to analysing generalisation that is inspired by Bayesian inference\, but l
 eads to a different role for the prior and posterior distributions. We wil
 l discuss its application to Support Vector Machines and Deep Neural Netwo
 rks\, including the use of distribution-defined priors.\n
LOCATION:https://researchseminars.org/talk/IASML/18/
END:VEVENT
BEGIN:VEVENT
SUMMARY:John Langford (Microsoft Research)
DTSTART:20200813T190000Z
DTEND:20200813T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/19
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/19/">L
 atent State Discovery in Reinforcement Learning</a>\nby John Langford (Mic
 rosoft Research) as part of IAS Seminar Series on Theoretical Machine Lear
 ning\n\n\nAbstract\nThere are three core orthogonal problems in reinforcem
 ent learning: (1) crediting actions\; (2) generalizing across rich obser
 vations\; (3) exploring to discover the information necessary for learni
 ng.  Goo
 d solutions to pairs of these problems are fairly well known at this point
 \, but solutions for all three are just now being discovered.   I’ll dis
 cuss several such results and dive into details on a few of them.\n
LOCATION:https://researchseminars.org/talk/IASML/19/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Li Deng (Citadel)
DTSTART:20200818T163000Z
DTEND:20200818T174500Z
DTSTAMP:20260422T212556Z
UID:IASML/20
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/20/">F
 rom Speech AI to Finance AI and Back</a>\nby Li Deng (Citadel) as part of 
 IAS Seminar Series on Theoretical Machine Learning\n\n\nAbstract\nA brief 
 review will be provided first on how deep learning has disrupted speech re
 cognition and language processing industries since 2009. Then connections 
 will be drawn between the techniques (deep learning or otherwise) for mode
 ling speech and language and those for financial markets. Similarities and
  differences of these two fields will be explored. In particular\, three u
 nique technical challenges to financial investment are addressed: extremel
 y low signal-to-noise ratio\, extremely strong nonstationarity (with adver
 sarial nature)\, and heterogeneous big data. Finally\, how the potential s
 olutions to these challenges can come back to benefit and further advance 
 speech recognition and language processing technology will be discussed.\n
LOCATION:https://researchseminars.org/talk/IASML/20/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jason Eisner (Johns Hopkins University)
DTSTART:20200820T190000Z
DTEND:20200820T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/21
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/21/">E
 vent Sequence Modeling with the Neural Hawkes Process</a>\nby Jason Eisner
  (Johns Hopkins University) as part of IAS Seminar Series on Theoretical M
 achine Learning\n\n\nAbstract\nSuppose you are monitoring discrete events 
 in real time.  Can you predict what events will happen in the future\, and
  when?  Can you fill in past events that you may have missed?  A probabili
 ty model that supports such reasoning is the neural Hawkes process (NHP)\,
  in which the Poisson intensities of K event types at time t depend on the
  history of past events.  This autoregressive architecture can capture com
 plex dependencies.  It resembles an LSTM language model over K word types\
 , but allows the LSTM state to evolve in continuous time.  \n\nThis talk w
 ill present the NHP model along with methods for estimating parameters (ML
 E and NCE)\, sampling predictions of the future (thinning)\, and imputing 
 missing events (particle smoothing).  I'll then show how to scale the NHP 
 or the LSTM language model to large K\, beginning with a temporal deductiv
 e database for a real-world domain\, which can track how possible event ty
 pes and other facts change over time.  We take the system state to be a co
 llection of vector-space embeddings of these facts\, and derive a deep rec
 urrent architecture from the temporal Datalog program that specifies the d
 atabase.  We call this method "neural Datalog through time."\n\nThis work 
 was done with Hongyuan Mei and other collaborators including Guanghui Qin\
 , Minjie Xu\, and Tom Wan.\n
LOCATION:https://researchseminars.org/talk/IASML/21/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Piotr Indyk (Massachusetts Institute of Technology)
DTSTART:20200825T163000Z
DTEND:20200825T174500Z
DTSTAMP:20260422T212556Z
UID:IASML/22
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/22/">L
 earning-Based Sketching Algorithms</a>\nby Piotr Indyk (Massachusetts Inst
 itute of Technology) as part of IAS Seminar Series on Theoretical Machine 
 Learning\n\n\nAbstract\nClassical algorithms typically provide "one size f
 its all" performance\, and do not leverage properties or patterns in their
  inputs. A recent line of work aims to address this issue by developing al
 gorithms that use machine learning predictions to improve their performanc
 e. In this talk\, I will present two examples of this type\, in the
  context of streaming and sketching algorithms. In particular\, I will
  show how to use machine learning predictions to improve the performance
  of (a) low-mem
 ory streaming algorithms for frequency estimation\, and (b) generating spa
 ce partitions for nearest neighbor search.\n\nThe talk will cover material
  from papers co-authored with Y Dong\, CY Hsu\, D Katabi\, I Razenshteyn\,
  T Wagner and A Vakilian.\n
LOCATION:https://researchseminars.org/talk/IASML/22/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Inderjit Dhillon (University of Texas at Austin)
DTSTART:20200827T190000Z
DTEND:20200827T203000Z
DTSTAMP:20260422T212556Z
UID:IASML/23
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/23/">M
 ulti-Output Prediction: Theory and Practice</a>\nby Inderjit Dhillon (Univ
 ersity of Texas at Austin) as part of IAS Seminar Series on Theoretical Ma
 chine Learning\n\n\nAbstract\nMany challenging problems in modern applicat
 ions amount to finding relevant results from an enormous output space of p
 otential candidates\, for example\, finding the best matching product from
  a large catalog or suggesting related search phrases on a search engine. 
 The size of the output space for these problems can range from millions
  to billions. Moreover\, observational or training data is often limited
  for many of the so-called “long-tail” items in the output space.
  Given t
 he inherent paucity of training data for most of the items in the output s
 pace\, developing machine learned models that perform well for spaces of t
 his size is challenging. Fortunately\, items in the output space are often
  correlated\, thereby presenting an opportunity to alleviate the data spars
 ity issue. In this talk\, I will first discuss the challenges in modern mul
 ti-output prediction\, including missing values\, features associated with
  outputs\, absence of explicit negative examples\, and the need to scale u
 p to enormous data sets. Bilinear methods\, such as Inductive Matrix Compl
 etion (IMC)\, enable us to handle missing values and output features in pr
 actice\, while coming with theoretical guarantees. Nonlinear methods such 
 as nonlinear IMC and DSSM (Deep Semantic Similarity Model) enable more pow
 erful models that are used in practice in real-life applications. However\
 , inference in these models scales linearly with the size of the output sp
 ace. In order to scale up\, I will present the Prediction for Enormous and
 Correlated Output Spaces (PECOS) framework\, which performs prediction
  in three phases: (i) in the first phase\, the output space is organized
  using a semantic indexing scheme\, (ii) in the second phase\, the
  indexing is u
 sed to narrow down the output space by orders of magnitude using a machine
  learned matching scheme\, and (iii) in the third phase\, the matched item
 s are ranked by a final ranking scheme. The versatility and modularity of 
 PECOS allow for easy plug-and-play of various choices for the indexing\,
  matching\, and ranking phases\, and it is possible to ensemble various
  models\, each arising from a particular choice for the three phases.\n
LOCATION:https://researchseminars.org/talk/IASML/23/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Soheil Feizi (University of Maryland College Park)
DTSTART:20200623T163000Z
DTEND:20200623T174500Z
DTSTAMP:20260422T212556Z
UID:IASML/24
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/IASML/24/">G
 eneralizable Adversarial Robustness to Unforeseen Attacks</a>\nby Soheil F
 eizi (University of Maryland College Park) as part of IAS Seminar Series o
 n Theoretical Machine Learning\n\n\nAbstract\nIn the last couple of years\
 , much progress has been made in enhancing the robustness of models
  against adversarial attacks. However\, two major shortcomings remain:
  (i) practical defenses are often vulnerable to strong “adaptive” attack
  algorithms\, and (ii) current defenses have poor generalization to “unf
 oreseen” attack threat models (the ones not used in training).\n\nIn thi
 s talk\, I will present our recent results to tackle these issues. I will 
 first discuss generalizability of a class of provable defenses based on ra
 ndomized smoothing to various Lp and non-Lp attack models. Then\, I will p
 resent adversarial attacks and defenses for a novel “perceptual” adver
 sarial threat model. Remarkably\, the defense against the perceptual
  threat model generalizes well against many types of unforeseen Lp and
  non-Lp advers
 arial attacks.\n\nThis talk is based on joint work with Alex Levine\,
  Sahil Singla\, Cassidy Laidlaw\, Aounon Kumar\, and Tom Goldstein.\n
LOCATION:https://researchseminars.org/talk/IASML/24/
END:VEVENT
END:VCALENDAR
