BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Himanshu Tyagi (IISc Bangalore)
DTSTART;VALUE=DATE-TIME:20201120T180000Z
DTEND;VALUE=DATE-TIME:20201120T190000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/1
DESCRIPTION:Title: General lower bounds for estimation under information const
raints\nby Himanshu Tyagi (IISc Bangalore) as part of Foundations of D
ata Science\n\n\nAbstract\nWe present very general lower bounds for parame
 tric estimation when only limited information per sample is allowed. These li
 mitations can arise\, for example\, in the form of communication constraint
s\, privacy constraints\, or linear measurements. Our lower bounds hold fo
r discrete distributions with large alphabet as well as continuous distrib
utions with high-dimensional parameters\, apply for any information constr
aint\, and are valid for any $\\ell_p$ loss function. Our bounds recover b
oth strong data processing inequality based bounds and Cramér-Rao based b
ound as special cases.\n\nThis talk is based on joint work with Jayadev Ac
harya and Clément Canonne.\n\nRegister to attend at https://docs.google.c
om/forms/d/e/1FAIpQLScoh-JZpgqveACNvFepKEtBObo6WEqUrjWVNSs_yt5EnO2pHw/view
form\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Adam Smith (Boston University)
DTSTART;VALUE=DATE-TIME:20201204T180000Z
DTEND;VALUE=DATE-TIME:20201204T190000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/2
DESCRIPTION:Title: When is Memorization of Irrelevant Training Data Necessary
for High-Accuracy Learning?\nby Adam Smith (Boston University) as part
of Foundations of Data Science\n\n\nAbstract\nModern machine learning mod
els are complex\, and frequently encode surprising amounts of information
about individual inputs. In extreme cases\, complex models appear to memor
ize entire input examples\, including seemingly irrelevant information (so
cial security numbers from text\, for example). In this paper\, we aim to
understand whether this sort of memorization is necessary for accurate lea
rning. We describe natural prediction problems in which every sufficiently
accurate training algorithm must encode\, in the prediction model\, essen
tially all the information about a large subset of its training examples.
This remains true even when the examples are high-dimensional and have ent
ropy much higher than the sample size\, and even when most of that informa
tion is ultimately irrelevant to the task at hand. Further\, our results d
o not depend on the training algorithm or the class of models used for lea
rning.\n\nOur problems are simple and fairly natural variants of the next-
symbol prediction and the cluster labeling tasks. These tasks can be seen
as abstractions of image- and text-related prediction problems. To establi
sh our results\, we reduce from a family of one-way communication problems
for which we prove new information complexity lower bounds.\n\nJoint work
with Gavin Brown\, Mark Bun\, Vitaly Feldman\, and Kunal Talwar.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Tim Roughgarden (Columbia University)
DTSTART;VALUE=DATE-TIME:20210318T180000Z
DTEND;VALUE=DATE-TIME:20210318T190000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/3
DESCRIPTION:Title: Data-Driven Algorithm Design\nby Tim Roughgarden (Colum
bia University) as part of Foundations of Data Science\n\n\nAbstract\nThe
best algorithm for a computational problem generally depends on the “rel
evant inputs”\, a concept that depends on the application domain and oft
en defies formal articulation. While there is a large literature on empiri
cal approaches to selecting the best algorithm for a given application dom
ain\, there has been surprisingly little theoretical analysis of the probl
em.\n\n\nWe adapt concepts from statistical and online learning theory to
reason about application-specific algorithm selection. Our models are stra
ightforward to understand\, but also expressive enough to capture several
existing approaches in the theoretical computer science and AI communities
\, ranging from self-improving algorithms to empirical performance models.
We present one framework that models algorithm selection as a statistical
learning problem\, and our work here shows that dimension notions from st
atistical learning theory\, historically used to measure the complexity of
classes of binary- and real-valued functions\, are relevant in a much bro
ader algorithmic context. We also study the online version of the algorith
m selection problem\, and give possibility and impossibility results for t
he existence of no-regret learning algorithms.\n\n\nJoint work with Rishi
Gupta.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/3/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Ingrid Daubechies (Duke University)
DTSTART;VALUE=DATE-TIME:20210401T170000Z
DTEND;VALUE=DATE-TIME:20210401T190000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/4
DESCRIPTION:Title: Discovering low-dimensional manifolds in high-dimensional d
ata sets\nby Ingrid Daubechies (Duke University) as part of Foundation
s of Data Science\n\n\nAbstract\nThis talk reviews diffusion methods to id
entify low-dimensional manifolds underlying high-dimensional datasets\, an
d illustrates that by pinpointing additional mathematical structure\, impr
oved results can be obtained. Much of the talk draws on a case study from
a collaboration with biological morphologists\, who compare different phen
otypical structures to study relationships of living or extinct animals wi
th their surroundings and each other. This is typically done from carefull
y defined anatomical correspondence points (landmarks) on e.g. bones\; suc
h landmarking draws on highly specialized knowledge. To make possible more
extensive use of large (and growing) databases\, algorithms are required
for automatic morphological correspondence maps\, without any preliminary
marking of special features or landmarks by the user.\n\nMore information
on https://sites.google.com/view/dstheory/home (abstract and speaker biogr
aphy).\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Hamed Hassani (University of Pennsylvania)
DTSTART;VALUE=DATE-TIME:20210506T180000Z
DTEND;VALUE=DATE-TIME:20210506T190000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/5
DESCRIPTION:Title: Learning Robust Models: How does the Geometry of Perturbati
ons Play a Role?\nby Hamed Hassani (University of Pennsylvania) as par
t of Foundations of Data Science\n\n\nAbstract\nIn this talk\, we will foc
us on the emerging field of (adversarially) robust machine learning. The t
alk will be self-contained and no particular background on robust learning
will be needed. Recent progress in this field has been accelerated by the
observation that despite unprecedented performance on clean data\, modern
learning models remain fragile to seemingly innocuous changes such as sma
ll\, norm-bounded additive perturbations. Moreover\, recent work in this f
ield has looked beyond norm-bounded perturbations and has revealed that va
rious other types of distributional shifts in the data can significantly d
egrade performance. However\, in general our understanding of such shifts
is in its infancy and several key questions remain unaddressed.\n\n\nThe g
oal of this talk is to explain why robust learning paradigms have to be de
signed — and sometimes rethought — based on the geometry of the input
perturbations. We will cover a wide range of perturbation geometries from
simple norm-bounded perturbations\, to sparse\, natural\, and more general
distribution shifts. As we will show\, the geometry of the perturbations
necessitates fundamental modifications to the learning procedure as well a
s the architecture in order to ensure robustness. In the first part of the
talk\, we will discuss our recent theoretical results on robust learning
with respect to various geometries\, along with fundamental tradeoffs betw
een robustness and accuracy\, phase transitions\, etc. The remaining porti
on of the talk will be about developing practical robust training algorith
ms and evaluating the resulting (robust) deep networks against state-of-th
e-art methods on naturally-varying\, real-world datasets.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Christos Papadimitriou (Columbia University)
DTSTART;VALUE=DATE-TIME:20210923T170000Z
DTEND;VALUE=DATE-TIME:20210923T180000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/6
DESCRIPTION:Title: How does the brain beget the mind?\nby Christos Papadim
 itriou (Columbia University) as part of Foundations of Data Science\n\n\n
Abstract\nHow do molecules\, cells and synapses effect reasoning\, intelli
gence\, planning\, language? Despite dazzling progress in experimental n
euroscience\, as well as in cognitive science at the other extreme of scal
e\, we do not seem to be making progress in the overarching question: the
gap is huge and a completely new approach seems to be required. As Richar
d Axel recently put it: "We don't have a logic for the transformation of
neural activity into thought [...]." \n\nWhat kind of formal system would
qualify as this "logic"? \n\nI will introduce the Assembly Calculus (AC)\
, a computational system which appears to be a promising bridge between ne
urons and cognition. Through this programming framework\, a Parser was re
cently implemented which (a) can handle reasonably complex sentences of En
glish and other languages\, and (b) works exclusively through the spiking
of neurons.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Maxim Raginsky (University of Illinois at Urbana-Champaign)
DTSTART;VALUE=DATE-TIME:20211021T170000Z
DTEND;VALUE=DATE-TIME:20211021T180000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/7
DESCRIPTION:by Maxim Raginsky (University of Illinois at Urbana-Champaign)
as part of Foundations of Data Science\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Nicole Immorlica (Microsoft Research New England)
DTSTART;VALUE=DATE-TIME:20211118T180000Z
DTEND;VALUE=DATE-TIME:20211118T190000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/8
DESCRIPTION:Title: Communicating with Anecdotes\nby Nicole Immorlica (Micr
osoft Research New England) as part of Foundations of Data Science\n\n\nAb
stract\nClassic models of communication in economics typically assume agen
ts can communicate any message. However\, many important communications\,
such as those in newspapers or politicians' speeches\, use data to convey
information. In this talk\, we explore how the reliance on data impacts co
mmunication. In our model\, there are two Bayesian agents (a sender and a
receiver) who wish to communicate. The receiver must take an action whose
payoff depends on their personal preferences and an unknown state of the w
orld. The sender has access to a collection of data points correlated with
the state of the world and can send exactly one of these to the receiver
in order to influence her choice of action. Importantly\, the sender's per
sonal preferences may differ from the receiver's\, which affects the sende
r's strategic choice of what to send. We show that in a Nash equilibrium e
ven a small difference in preferences can lead to a significant bias in th
e communicated datum. This can significantly reduce informativeness of the
communication\, leading to substantial utility loss for both sides. One i
mplication is informational homophily: a receiver can rationally prefer to
obtain data from a poorly-informed sender with aligned preferences\, rath
er than a knowledgeable expert whose preferences may differ from her own.\
n\n\nJoint work with Nika Haghtalab\, Brendan Lucier\, Markus Mobius and D
ivya Mohan.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Eric Tchetgen Tchetgen (The Wharton School\, University of Pennsyl
vania)
DTSTART;VALUE=DATE-TIME:20220316T200000Z
DTEND;VALUE=DATE-TIME:20220316T210000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/9
DESCRIPTION:Title: An Introduction to Proximal Causal Learning\nby Eric Tc
hetgen Tchetgen (The Wharton School\, University of Pennsylvania) as part
of Foundations of Data Science\n\n\nAbstract\nA standard assumption for ca
usal inference from observational data is that one has measured a sufficie
ntly rich set of covariates to ensure that within covariates strata\, subj
ects are exchangeable across observed treatment values. Skepticism about t
he exchangeability assumption in observational studies is often warranted
because it hinges on one’s ability to accurately measure covariates capt
uring all potential sources of confounding. Realistically\, confounding me
 chanisms can rarely\, if ever\, be learned with certainty from measured cova
riates. One can therefore only ever hope that covariate measurements are a
t best proxies of true underlying confounding mechanisms operating in an o
 bservational study\, thus invalidating causal claims made on the basis of stan
dard exchangeability conditions.\n\nCausal learning from proxies is a chal
lenging inverse problem which has to date remained unresolved. In this pap
er\, we introduce a formal potential outcome framework for proximal causal
learning\, which while explicitly acknowledging covariate measurements as
imperfect proxies of confounding mechanisms\, offers an opportunity to le
arn about causal effects in settings where exchangeability based on measur
ed covariates fails. Sufficient conditions for nonparametric identificatio
n are given\, leading to the proximal g-formula and corresponding proximal
g-computation algorithm for estimation\, both generalizations of Robins
’ foundational g-formula and g-computation algorithm\, which account exp
licitly for bias due to unmeasured confounding. Both point treatment and t
ime-varying treatment settings are considered\, and an application of prox
imal g-computation of causal effects is given for illustration.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Gabriel Peyré (CNRS and Ecole Normale Supérieure)
DTSTART;VALUE=DATE-TIME:20220420T190000Z
DTEND;VALUE=DATE-TIME:20220420T200000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/10
DESCRIPTION:Title: Scaling Optimal Transport for High dimensional Learning
 \nby Gabriel Peyré (CNRS and Ecole Normale Supérieure) as part of Found
 ations of Data Science\n\n\nAbstract\nOptimal transport (OT) has recently ga
 ined a lot of interest in machine learning. It is a natural tool to comp
 are probability distributions in a geometrically faithful way. It finds appl
ications in both supervised learning (using geometric loss functions) and
unsupervised learning (to perform generative model fitting). OT is however
plagued by the curse of dimensionality\, since it might require a number
of samples which grows exponentially with the dimension. In this talk\, I
will explain how to leverage entropic regularization methods to define com
putationally efficient loss functions\, approximating OT with a better sam
ple complexity. More information and references can be found on the websit
e of our book "Computational Optimal Transport" https://optimaltransport.g
ithub.io/\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Sébastien Bubeck (Microsoft Research)
DTSTART;VALUE=DATE-TIME:20220511T200000Z
DTEND;VALUE=DATE-TIME:20220511T210000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/11
DESCRIPTION:Title: Set Chasing\, with an application to online shortest pat
 h\nby Sébastien Bubeck (Microsoft Research) as part of Foundations of D
ata Science\n\n\nAbstract\nSince the late 19th century\, mathematicians ha
ve realized the importance and generality of selection problems: given a c
ollection of sets\, select an element in each set\, possibly in a "nice" w
ay. Of particular importance in computer science is the scenario where the
ground set is a metric space\, in which case it is natural to ask for *Li
pschitz* selection. In this talk I will describe a far-reaching extension
of this classical Lipschitz selection problem to an *online* setting\, whe
re sets are streaming to the selector. I will show how Riemannian gradient
descent (aka mirror descent) can be used to approach this type of problem
s. I will illustrate the power of the framework by solving a long-standing
problem in online shortest path known as layered graph traversal (introdu
ced by Papadimitriou and Yannakakis in 1989).\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/11/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Shuchi Chawla (UT Austin)
DTSTART;VALUE=DATE-TIME:20220406T200000Z
DTEND;VALUE=DATE-TIME:20220406T210000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/12
DESCRIPTION:Title: Pandora's Box with Correlations: Learning and Approximatio
n\nby Shuchi Chawla (UT Austin) as part of Foundations of Data Science
\n\n\nAbstract\nIn the Pandora's Box problem\, the algorithm is provided w
ith a number of boxes with unknown (stochastic) rewards contained inside t
hem. The algorithm can open any box at some cost\, discover the reward ins
ide\, and based on these observations can choose one box and keep the rewa
rd contained in it. Given the distributions from which the rewards are dra
wn\, the algorithm must determine an order in which to open the boxes as w
ell as when to stop and accept the best reward found so far. In general\,
an optimal algorithm may make both decisions adaptively based on instantia
tions observed previously. The Pandora's Box problem and its extensions ca
pture many kinds of optimization problems with stochastic input where the
algorithm can obtain instantiations of input random variables at some cost
. Previous work on these problems assumes that different random variables
in the input are distributed independently. As such it does not capture ma
ny real-world settings. In this work\, we provide the first algorithms for
Pandora's Box-type problems with correlations. In the independent setting
\, optimal algorithms are non-adaptive and based on the notion of the Gitt
ins index. These techniques fail to extend to the correlated case. We assu
me that the algorithm has access to samples drawn from the joint distribut
ion on input and provide solutions that require few samples\; are computat
ionally efficient\; and guarantee approximate optimality. This is joint wo
rk with Evangelia Gergatsouli\, Yifeng Teng\, Christos Tzamos\, and Ruimin
Zhang and appeared in FOCS'20.\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/12/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Ravi Kumar (Google)
DTSTART;VALUE=DATE-TIME:20220615T200000Z
DTEND;VALUE=DATE-TIME:20220615T210000Z
DTSTAMP;VALUE=DATE-TIME:20230208T061737Z
UID:DataScienceFoundations/13
DESCRIPTION:Title: (Date to be confirmed)\nby Ravi Kumar (Google) as part
of Foundations of Data Science\n\nAbstract: TBA\n
LOCATION:https://researchseminars.org/talk/DataScienceFoundations/13/
END:VEVENT
END:VCALENDAR