BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Himanshu Tyagi (IISc Bangalore)
DTSTART:20201120T180000Z
DTEND:20201120T190000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/1
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/1/">General lower bounds for estimation under information const
 raints</a>\nby Himanshu Tyagi (IISc Bangalore) as part of Foundations of D
 ata Science\n\n\nAbstract\nWe present very general lower bounds for parame
 tric estimation when only limited information per sample is allowed. These
  limitations can arise\, for example\, in the form of communication constraint
 s\, privacy constraints\, or linear measurements. Our lower bounds hold fo
 r discrete distributions with large alphabet as well as continuous distrib
 utions with high-dimensional parameters\, apply for any information constr
 aint\, and are valid for any $\\ell_p$ loss function. Our bounds recover b
 oth strong data processing inequality based bounds and Cramér-Rao based b
 ound as special cases.\n\nThis talk is based on joint work with Jayadev Ac
 harya and Clément Canonne.\n\nRegister to attend at https://docs.google.c
 om/forms/d/e/1FAIpQLScoh-JZpgqveACNvFepKEtBObo6WEqUrjWVNSs_yt5EnO2pHw/view
 form\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Adam Smith (Boston University)
DTSTART:20201204T180000Z
DTEND:20201204T190000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/2
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/2/">When is Memorization of Irrelevant Training Data Necessary 
 for High-Accuracy Learning?</a>\nby Adam Smith (Boston University) as part
  of Foundations of Data Science\n\n\nAbstract\nModern machine learning mod
 els are complex\, and frequently encode surprising amounts of information 
 about individual inputs. In extreme cases\, complex models appear to memor
 ize entire input examples\, including seemingly irrelevant information (so
 cial security numbers from text\, for example). In this paper\, we aim to 
 understand whether this sort of memorization is necessary for accurate lea
 rning. We describe natural prediction problems in which every sufficiently
  accurate training algorithm must encode\, in the prediction model\, essen
 tially all the information about a large subset of its training examples. 
 This remains true even when the examples are high-dimensional and have ent
 ropy much higher than the sample size\, and even when most of that informa
 tion is ultimately irrelevant to the task at hand. Further\, our results d
 o not depend on the training algorithm or the class of models used for lea
 rning.\n\nOur problems are simple and fairly natural variants of the next-
 symbol prediction and the cluster labeling tasks. These tasks can be seen 
 as abstractions of image- and text-related prediction problems. To establi
 sh our results\, we reduce from a family of one-way communication problems
  for which we prove new information complexity lower bounds.\n\nJoint work
  with Gavin Brown\, Mark Bun\, Vitaly Feldman\, and Kunal Talwar.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Tim Roughgarden (Columbia University)
DTSTART:20210318T180000Z
DTEND:20210318T190000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/3
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/3/">Data-Driven Algorithm Design</a>\nby Tim Roughgarden (Colum
 bia University) as part of Foundations of Data Science\n\n\nAbstract\nThe 
 best algorithm for a computational problem generally depends on the “rel
 evant inputs”\, a concept that depends on the application domain and oft
 en defies formal articulation. While there is a large literature on empiri
 cal approaches to selecting the best algorithm for a given application dom
 ain\, there has been surprisingly little theoretical analysis of the probl
 em.\n\n\nWe adapt concepts from statistical and online learning theory to 
 reason about application-specific algorithm selection. Our models are stra
 ightforward to understand\, but also expressive enough to capture several 
 existing approaches in the theoretical computer science and AI communities
 \, ranging from self-improving algorithms to empirical performance models.
  We present one framework that models algorithm selection as a statistical
  learning problem\, and our work here shows that dimension notions from st
 atistical learning theory\, historically used to measure the complexity of
  classes of binary- and real-valued functions\, are relevant in a much bro
 ader algorithmic context. We also study the online version of the algorith
 m selection problem\, and give possibility and impossibility results for t
 he existence of no-regret learning algorithms.\n\n\nJoint work with Rishi 
 Gupta.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/3/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Ingrid Daubechies (Duke University)
DTSTART:20210401T170000Z
DTEND:20210401T190000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/4
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/4/">Discovering low-dimensional manifolds in high-dimensional d
 ata sets</a>\nby Ingrid Daubechies (Duke University) as part of Foundation
 s of Data Science\n\n\nAbstract\nThis talk reviews diffusion methods to id
 entify low-dimensional manifolds underlying high-dimensional datasets\, an
 d illustrates that by pinpointing additional mathematical structure\, impr
 oved results can be obtained. Much of the talk draws on a case study from 
 a collaboration with biological morphologists\, who compare different phen
 otypical structures to study relationships of living or extinct animals wi
 th their surroundings and each other. This is typically done from carefull
 y defined anatomical correspondence points (landmarks) on e.g. bones\; suc
 h landmarking draws on highly specialized knowledge. To make possible more
  extensive use of large (and growing) databases\, algorithms are required 
 for automatic morphological correspondence maps\, without any preliminary 
 marking of special features or landmarks by the user.\n\nMore information 
 on https://sites.google.com/view/dstheory/home (abstract and speaker biogr
 aphy).\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Hamed Hassani (University of Pennsylvania)
DTSTART:20210506T180000Z
DTEND:20210506T190000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/5
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/5/">Learning Robust Models: How does the Geometry of Perturbati
 ons Play a Role?</a>\nby Hamed Hassani (University of Pennsylvania) as par
 t of Foundations of Data Science\n\n\nAbstract\nIn this talk\, we will foc
 us on the emerging field of (adversarially) robust machine learning. The t
 alk will be self-contained and no particular background on robust learning
  will be needed. Recent progress in this field has been accelerated by the
  observation that despite unprecedented performance on clean data\, modern
  learning models remain fragile to seemingly innocuous changes such as sma
 ll\, norm-bounded additive perturbations. Moreover\, recent work in this f
 ield has looked beyond norm-bounded perturbations and has revealed that va
 rious other types of distributional shifts in the data can significantly d
 egrade performance. However\, in general our understanding of such shifts 
 is in its infancy and several key questions remain unaddressed.\n\n\nThe g
 oal of this talk is to explain why robust learning paradigms have to be de
 signed — and sometimes rethought — based on the geometry of the input 
 perturbations. We will cover a wide range of perturbation geometries from 
 simple norm-bounded perturbations\, to sparse\, natural\, and more general
  distribution shifts. As we will show\, the geometry of the perturbations 
 necessitates fundamental modifications to the learning procedure as well a
 s the architecture in order to ensure robustness. In the first part of the
  talk\, we will discuss our recent theoretical results on robust learning 
 with respect to various geometries\, along with fundamental tradeoffs betw
 een robustness and accuracy\, phase transitions\, etc. The remaining porti
 on of the talk will be about developing practical robust training algorith
 ms and evaluating the resulting (robust) deep networks against state-of-th
 e-art methods on naturally-varying\, real-world datasets.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Christos Papadimitriou (Columbia University)
DTSTART:20210923T170000Z
DTEND:20210923T180000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/6
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/6/">How does the brain beget the mind?</a>\nby Christos Papadim
 itriou (Columbia University) as part of Foundations of Data Science\n\n\n
 Abstract\nHow do molecules\, cells and synapses effect reasoning\, intelli
 gence\, planning\, language?   Despite dazzling progress in experimental n
 euroscience\, as well as in cognitive science at the other extreme of scal
 e\, we do not seem to be making progress on the overarching question: the
  gap is huge and a completely new approach seems to be required.  As Richar
 d Axel recently put it:  "We don't have a logic for the transformation of 
 neural activity into thought [...]."  \n\nWhat kind of formal system would
  qualify as this "logic"? \n\nI will introduce the Assembly Calculus (AC)\
 , a computational system which appears to be a promising bridge between ne
 urons and cognition.  Through this programming framework\, a Parser was re
 cently implemented which (a) can handle reasonably complex sentences of En
 glish and other languages\, and (b) works exclusively through the spiking 
 of neurons.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Maxim Raginsky (University of Illinois at Urbana-Champaign)
DTSTART:20211021T170000Z
DTEND:20211021T180000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/7
DESCRIPTION:by Maxim Raginsky (University of Illinois at Urbana-Champaign)
  as part of Foundations of Data Science\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Nicole Immorlica (Microsoft Research New England)
DTSTART:20211118T180000Z
DTEND:20211118T190000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/8
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/8/">Communicating with Anecdotes</a>\nby Nicole Immorlica (Micr
 osoft Research New England) as part of Foundations of Data Science\n\n\nAb
 stract\nClassic models of communication in economics typically assume agen
 ts can communicate any message. However\, many important communications\, 
 such as those in newspapers or politicians' speeches\, use data to convey 
 information. In this talk\, we explore how the reliance on data impacts co
 mmunication. In our model\, there are two Bayesian agents (a sender and a 
 receiver) who wish to communicate. The receiver must take an action whose 
 payoff depends on their personal preferences and an unknown state of the w
 orld. The sender has access to a collection of data points correlated with
  the state of the world and can send exactly one of these to the receiver 
 in order to influence her choice of action. Importantly\, the sender's per
 sonal preferences may differ from the receiver's\, which affects the sende
 r's strategic choice of what to send. We show that in a Nash equilibrium e
 ven a small difference in preferences can lead to a significant bias in th
 e communicated datum. This can significantly reduce informativeness of the
  communication\, leading to substantial utility loss for both sides. One i
 mplication is informational homophily: a receiver can rationally prefer to
  obtain data from a poorly-informed sender with aligned preferences\, rath
 er than a knowledgeable expert whose preferences may differ from her own.\
 n\n\nJoint work with Nika Haghtalab\, Brendan Lucier\, Markus Mobius and D
 ivya Mohan.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Eric Tchetgen Tchetgen (The Wharton School\, University of Pennsyl
 vania)
DTSTART:20220316T200000Z
DTEND:20220316T210000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/9
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/9/">An Introduction to Proximal Causal Learning</a>\nby Eric Tc
 hetgen Tchetgen (The Wharton School\, University of Pennsylvania) as part 
 of Foundations of Data Science\n\n\nAbstract\nA standard assumption for ca
 usal inference from observational data is that one has measured a sufficie
 ntly rich set of covariates to ensure that within covariates strata\, subj
 ects are exchangeable across observed treatment values. Skepticism about t
 he exchangeability assumption in observational studies is often warranted 
 because it hinges on one’s ability to accurately measure covariates capt
 uring all potential sources of confounding. Realistically\, confounding me
 chanisms can rarely\, if ever\, be learned with certainty from measured cova
 riates. One can therefore only ever hope that covariate measurements are a
 t best proxies of true underlying confounding mechanisms operating in an o
 bservational study\, thus invalidating causal claims made on the basis of stan
 dard exchangeability conditions.\n\nCausal learning from proxies is a chal
 lenging inverse problem which has to date remained unresolved. In this pap
 er\, we introduce a formal potential outcome framework for proximal causal
  learning\, which while explicitly acknowledging covariate measurements as
  imperfect proxies of confounding mechanisms\, offers an opportunity to le
 arn about causal effects in settings where exchangeability based on measur
 ed covariates fails. Sufficient conditions for nonparametric identificatio
 n are given\, leading to the proximal g-formula and corresponding proximal
  g-computation algorithm for estimation\, both generalizations of Robins
 ’ foundational g-formula and g-computation algorithm\, which account exp
 licitly for bias due to unmeasured confounding. Both point treatment and t
 ime-varying treatment settings are considered\, and an application of prox
 imal g-computation of causal effects is given for illustration.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Gabriel Peyré (CNRS and Ecole Normale Supérieure)
DTSTART:20220420T190000Z
DTEND:20220420T200000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/10
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/10/">Scaling Optimal Transport for High dimensional Learning</a
 >\nby Gabriel Peyré (CNRS and Ecole Normale Supérieure) as part of Found
 ations of Data Science\n\n\nAbstract\nOptimal transport (OT) has recently 
 gained a lot of interest in machine learning. It is a natural tool to
  compare probability distributions in a geometrically faithful way. It finds appl
 ications in both supervised learning (using geometric loss functions) and 
 unsupervised learning (to perform generative model fitting). OT is however
  plagued by the curse of dimensionality\, since it might require a number 
 of samples which grows exponentially with the dimension. In this talk\, I 
 will explain how to leverage entropic regularization methods to define com
 putationally efficient loss functions\, approximating OT with a better sam
 ple complexity. More information and references can be found on the websit
 e of our book "Computational Optimal Transport" https://optimaltransport.g
 ithub.io/\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Sébastien Bubeck (Microsoft Research)
DTSTART:20220511T200000Z
DTEND:20220511T210000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/11
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/11/">Set Chasing\, with an application to online shortest path<
 /a>\nby Sébastien Bubeck (Microsoft Research) as part of Foundations of D
 ata Science\n\n\nAbstract\nSince the late 19th century\, mathematicians ha
 ve realized the importance and generality of selection problems: given a c
 ollection of sets\, select an element in each set\, possibly in a "nice" w
 ay. Of particular importance in computer science is the scenario where the
  ground set is a metric space\, in which case it is natural to ask for *Li
 pschitz* selection. In this talk I will describe a far-reaching extension 
 of this classical Lipschitz selection problem to an *online* setting\, whe
 re sets are streaming to the selector. I will show how Riemannian gradient
 descent (aka mirror descent) can be used to approach this type of problem.
  I will illustrate the power of the framework by solving a long-standing
  problem in online shortest path known as layered graph traversal (introdu
 ced by Papadimitriou and Yannakakis in 1989).\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/11/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Shuchi Chawla (UT Austin)
DTSTART:20220406T200000Z
DTEND:20220406T210000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/12
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/12/">Pandora's Box with Correlations: Learning and Approximatio
 n</a>\nby Shuchi Chawla (UT Austin) as part of Foundations of Data Science
 \n\n\nAbstract\nIn the Pandora's Box problem\, the algorithm is provided w
 ith a number of boxes with unknown (stochastic) rewards contained inside t
 hem. The algorithm can open any box at some cost\, discover the reward ins
 ide\, and based on these observations can choose one box and keep the rewa
 rd contained in it. Given the distributions from which the rewards are dra
 wn\, the algorithm must determine an order in which to open the boxes as w
 ell as when to stop and accept the best reward found so far. In general\, 
 an optimal algorithm may make both decisions adaptively based on instantia
 tions observed previously. The Pandora's Box problem and its extensions ca
 pture many kinds of optimization problems with stochastic input where the 
 algorithm can obtain instantiations of input random variables at some cost
 . Previous work on these problems assumes that different random variables 
 in the input are distributed independently. As such\, it does not capture ma
 ny real-world settings. In this work\, we provide the first algorithms for
  Pandora's Box-type problems with correlations. In the independent setting
 \, optimal algorithms are non-adaptive and based on the notion of the Gitt
 ins index. These techniques fail to extend to the correlated case. We assu
 me that the algorithm has access to samples drawn from the joint distribut
 ion on input and provide solutions that require few samples\; are computat
 ionally efficient\; and guarantee approximate optimality. This is joint wo
 rk with Evangelia Gergatsouli\, Yifeng Teng\, Christos Tzamos\, and Ruimin
  Zhang and appeared in FOCS'20.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/12/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Ravi Kumar (Google)
DTSTART:20220615T200000Z
DTEND:20220615T210000Z
DTSTAMP:20260422T225637Z
UID:DataScienceFoundations/13
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/DataScienceF
 oundations/13/">(Date to be confirmed)</a>\nby Ravi Kumar (Google) as part
  of Foundations of Data Science\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/13/
END:VEVENT
END:VCALENDAR
