BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Gabriel Peyré (CNRS\, Ecole Normale Supérieure)
DTSTART;VALUE=DATE-TIME:20200420T120000Z
DTEND;VALUE=DATE-TIME:20200420T124500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/1
DESCRIPTION:Title: S
caling Optimal Transport for High dimensional Learning\nby Gabriel Pey
ré (CNRS\, Ecole Normale Supérieure) as part of One World seminar: Mathe
matical Methods for Arbitrary Data Sources (MADS)\n\n\nAbstract\nOptimal t
ransport (OT) has recently gained a lot of interest in machine learning. It
is a natural tool to compare in a geometrically faithful way probability d
istributions. It finds applications in both supervised learning (using geo
metric loss functions) and unsupervised learning (to perform generative mo
del fitting). OT is however plagued by the curse of dimensionality\, since
it might require a number of samples which grows exponentially with the d
imension. In this talk\, I will review entropic regularization methods whi
ch define geometric loss functions approximating OT with a better sample c
omplexity. More information and references can be found on the website of
our book Computational Optimal Transport.\n
LOCATION:https://researchseminars.org/talk/OWMADS/1/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Marie-Therese Wolfram (Warwick University\, UK)
DTSTART;VALUE=DATE-TIME:20200420T130000Z
DTEND;VALUE=DATE-TIME:20200420T134500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/2
DESCRIPTION:Title: I
nverse Optimal Transport\nby Marie-Therese Wolfram (Warwick University
\, UK) as part of One World seminar: Mathematical Methods for Arbitrary Da
ta Sources (MADS)\n\n\nAbstract\nDiscrete optimal transportation problems
arise in various contexts in engineering\, the sciences and the social sci
ences. Examples include the marriage market in economics or international
migration flows in demographics. Often the underlying cost criterion is un
known\, or only partly known\, and the observed optimal solutions are corr
upted by noise. In this talk we discuss a systematic approach to infer unk
nown costs from noisy observations of optimal transportation plans. The pr
oposed methodologies are developed within the Bayesian framework for inver
se problems and require only the ability to solve the forward optimal tran
sport problem\, which is a linear program\, and to generate random numbers
. We illustrate our approach using the example of international migration
flows. Here reported migration flow data captures (noisily) the number of
individuals moving from one country to another in a given period of time.
It can be interpreted as a noisy observation of an optimal transportation
map\, with costs related to the geographical position of countries. We use
a graph-based formulation of the problem\, with countries at the nodes of
graphs and non-zero weighted adjacencies only on edges between countries
which share a border. We use the proposed algorithm to estimate the weight
s\, which represent the cost of transition\, and to quantify uncertainty in th
ese weights.\n
LOCATION:https://researchseminars.org/talk/OWMADS/2/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lorenzo Rosasco (Università di Genova\, IT - MIT\, US)
DTSTART;VALUE=DATE-TIME:20200504T120000Z
DTEND;VALUE=DATE-TIME:20200504T124500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/3
DESCRIPTION:Title: E
fficient kernel-PCA by Nyström sampling\nby Lorenzo Rosasco (Universi
tà di Genova\, IT - MIT\, US) as part of One World seminar: Mathematical
Methods for Arbitrary Data Sources (MADS)\n\n\nAbstract\nIn this talk\, we
discuss and study a Nyström based approach to efficient large scale kern
el principal component analysis (PCA). The latter is a natural nonlinear e
xtension of classical PCA based on considering a nonlinear feature map or
the corresponding kernel. Like other kernel approaches\, kernel PCA enjoys
good mathematical and statistical properties but\, numerically\, it scale
s poorly with the sample size. Our analysis shows that Nyström sampling g
reatly improves computational efficiency without incurring any loss of sta
tistical accuracy. While similar effects have been observed in supervised
learning\, this is the first such result for PCA. Our theoretical findings
\, which are also illustrated by numerical results\, are based on a combin
ation of analytic and concentration of measure techniques. Our study is mo
re broadly motivated by the question of understanding the interplay betwee
n statistical and computational requirements for learning.\n
LOCATION:https://researchseminars.org/talk/OWMADS/3/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lars Ruthotto (Emory University\, US)
DTSTART;VALUE=DATE-TIME:20200518T120000Z
DTEND;VALUE=DATE-TIME:20200518T124500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/4
DESCRIPTION:Title: M
achine learning meets optimal transport: old solutions for new problems an
d vice versa\nby Lars Ruthotto (Emory University\, US) as part of One
World seminar: Mathematical Methods for Arbitrary Data Sources (MADS)\n\n\
nAbstract\nThis talk presents new connections between optimal transport (O
T)\, which has been a critical problem in applied mathematics for centurie
s\, and machine learning (ML)\, which has been receiving enormous attentio
n in the past decades. In recent years\, OT and ML have become increasingl
y intertwined. This talk contributes to this booming intersection by provi
ding efficient and scalable computational methods for OT and ML.\nThe firs
t part of the talk shows how neural networks can be used to efficiently ap
proximate the optimal transport map between two densities in high dimensio
ns. To avoid the curse-of-dimensionality\, we combine Lagrangian and Euler
ian viewpoints and employ neural networks to solve the underlying Hamilton
-Jacobi-Bellman equation. Our approach avoids any space discretization and
can be implemented in existing machine learning frameworks. We present nu
merical results for OT in up to 100 dimensions and validate our solver in
a two-dimensional setting. \nThe second part of the talk shows how optimal
transport theory can improve the efficiency of training generative models
and density estimators\, which are critical in machine learning. We consi
der continuous normalizing flows (CNF) that have emerged as one of the mos
t promising approaches for variational inference in the ML community. Our
numerical implementation is a discretize-optimize method whose forward pro
blem relies on manually derived gradients and Laplacian of the neural netw
ork and uses automatic differentiation in the optimization. In common benc
hmark challenges\, our method outperforms state-of-the-art CNF approaches
by reducing the network size by 8x\, accelerating the training by 10x-40x a
nd allowing 30x-50x faster inference.\n
LOCATION:https://researchseminars.org/talk/OWMADS/4/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Vincent Duval (Inria\, FR)
DTSTART;VALUE=DATE-TIME:20200608T130000Z
DTEND;VALUE=DATE-TIME:20200608T134500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/5
DESCRIPTION:Title: R
epresenting the solutions of total variation regularized problems\nby
Vincent Duval (Inria\, FR) as part of One World seminar: Mathematical Meth
ods for Arbitrary Data Sources (MADS)\n\n\nAbstract\nRepresenting the solu
tions of total variation regularized problems\n\nThe total (gradient) vari
ation is a regularizer which has been widely used in inverse problems aris
ing in image processing\, following the pioneering work of Rudin\, Osher a
nd Fatemi. In this talk\, I will describe the structure of the solutions to t
he total variation regularized variational problems when one has a finite
number of measurements.\nFirst\, I will present a general representation p
rinciple for the solutions of convex problems\, then I will apply it to th
e total variation by describing the faces of its unit ball.\n\nIt is a joi
nt work with Claire Boyer\, Antonin Chambolle\, Yohann De Castro\, Frédé
ric de Gournay and Pierre Weiss.\n
LOCATION:https://researchseminars.org/talk/OWMADS/5/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Michael Unser (École polytechnique fédérale de Lausanne\, CH)
DTSTART;VALUE=DATE-TIME:20200608T120000Z
DTEND;VALUE=DATE-TIME:20200608T124500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/6
DESCRIPTION:Title: R
epresenter theorems for machine learning and inverse problems\nby Mich
ael Unser (École polytechnique fédérale de Lausanne\, CH) as part of On
e World seminar: Mathematical Methods for Arbitrary Data Sources (MADS)\n\
n\nAbstract\nRegularization addresses the ill-posedness of the training pr
oblem in machine learning or the reconstruction of a signal from a limited
number of measurements. The standard strategy consists in augmenting the
original cost functional by an energy that penalizes solutions with undesi
rable behaviour. In this presentation\, I will present a general represent
er theorem that characterizes the solutions of a remarkably broad class of
optimization problems in Banach spaces and helps us understand the effect
of regularization. I will then use the theorem to retrieve some classical
characterizations such as the celebrated representer theorem of machine l
earning for RKHS\, Tikhonov regularization\, representer theorems for spars
ity promoting functionals\, as well as a few new ones\, including a result
for deep neural networks.\n
LOCATION:https://researchseminars.org/talk/OWMADS/6/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Nicolás García Trillos (University of Wisconsin-Madison\, US)
DTSTART;VALUE=DATE-TIME:20200615T130000Z
DTEND;VALUE=DATE-TIME:20200615T134500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/7
DESCRIPTION:Title: R
egularity theory and uniform convergence in the large data limit of graph
Laplacian eigenvectors on random data clouds.\nby Nicolás García Tri
llos (University of Wisconsin-Madison\, US) as part of One World seminar:
Mathematical Methods for Arbitrary Data Sources (MADS)\n\n\nAbstract\nGrap
h Laplacians are omnipresent objects in machine learning that have been us
ed in supervised\, unsupervised and semi-supervised settings due to their
versatility in extracting local and global geometric information from data
clouds. In this talk I will present an overview of how the mathematical t
heory built around them has gotten deeper and deeper\, layer by layer\, si
nce the appearance of the first results on pointwise consistency in the 20
00’s\, until the most recent developments\; this line of research has fo
und strong connections between PDEs built on proximity graphs on data clou
ds and PDEs on manifolds\, and has given a more precise mathematical meani
ng to the task of “manifold learning”. In the first part of the talk I
will highlight how ideas from optimal transport made some of the initial
steps\, which provided L2 type error estimates between the spectra of gra
ph Laplacians and Laplace-Beltrami operators\, possible. In the second par
t of the talk\, which is based on recent work with Jeff Calder and Marta L
ewicka\, I will present a newly developed regularity theory for graph Lapl
acians which among other things allows us to bootstrap the L2 error estimat
es developed through optimal transport and upgrade them to uniform converg
ence and almost C^{0\,1} convergence rates. The talk can be seen as a tale
of how a flow of ideas from optimal transport\, PDEs\, and in general\, a
nalysis\, has made possible a finer understanding of concrete objects popu
lar in data analysis and machine learning.\n
LOCATION:https://researchseminars.org/talk/OWMADS/7/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Michaël Fanuel (KU Leuven\, BE)
DTSTART;VALUE=DATE-TIME:20200504T130000Z
DTEND;VALUE=DATE-TIME:20200504T134500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/8
DESCRIPTION:Title: D
iversity sampling in kernel method\nby Michaël Fanuel (KU Leuven\, BE
) as part of One World seminar: Mathematical Methods for Arbitrary Data So
urces (MADS)\n\n\nAbstract\nA well-known technique for large scale kernel
methods is the Nyström approximation. Based on a subset of landmarks\, it
gives a low rank approximation of the kernel matrix\, and is known to pro
vide a form of implicit regularization. We will discuss the impact of samp
ling diverse landmarks for constructing the Nyström approximation in supe
rvised and unsupervised problems. In particular\, three methods will be co
nsidered: uniform sampling\, leverage score sampling and Determinantal Poi
nt Processes (DPP). The implicit regularization due to the diversity of the l
andmarks will be made explicit by numerical simulations and analysed furth
er in the case of DPP sampling by some theoretical results.\n
LOCATION:https://researchseminars.org/talk/OWMADS/8/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Francis Bach (Inria\, FR)
DTSTART;VALUE=DATE-TIME:20200518T130000Z
DTEND;VALUE=DATE-TIME:20200518T134500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/9
DESCRIPTION:Title: O
n the convergence of gradient descent for wide two-layer neural networks\nby Francis Bach (Inria\, FR) as part of One World seminar: Mathematica
l Methods for Arbitrary Data Sources (MADS)\n\n\nAbstract\nMany supervised
learning methods are naturally cast as optimization problems. For predict
ion models which are linear in their parameters\, this often leads to conv
ex problems for which many guarantees exist. Models which are non-linear i
n their parameters such as neural networks lead to non-convex optimization
problems for which guarantees are harder to obtain. In this talk\, I will
consider two-layer neural networks with homogeneous activation functions
where the number of hidden neurons tends to infinity\, and show how qualit
ative convergence guarantees may be derived. I will also highlight open pr
oblems related to the quantitative behavior of gradient descent for such m
odels. (Based on joint work with Lénaïc Chizat\, https://arxiv.org/abs/1
805.09545\, https://arxiv.org/abs/2002.04486)\n\nPlease note that this is
a joint talk with the One World Optimization Seminar.\n
LOCATION:https://researchseminars.org/talk/OWMADS/9/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Andrea Braides (University of Rome Tor Vergata)
DTSTART;VALUE=DATE-TIME:20200615T120000Z
DTEND;VALUE=DATE-TIME:20200615T124500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/10
DESCRIPTION:Title:
Continuum limits of interfacial energies on (sparse and) dense graphs\
nby Andrea Braides (University of Rome Tor Vergata) as part of One World s
eminar: Mathematical Methods for Arbitrary Data Sources (MADS)\n\n\nAbstra
ct\nI review some results on the convergence of energies defined on graphs
. My interest in such energies comes from models in Solid Mechanics (where
the bonds in the graph represent the relevant atomistic interactions) or
Statistical Physics (Ising systems)\, but the nodes of the graph can also
be thought of as a collection of data on which the bonds describe some relati
on between the data.\nThe typical objective is an approximate (simplified)
continuum description of problems of minimal cut as the number N of the n
odes of the graphs diverges.\nIf the graphs are sparse (i.e. the number of
bonds is much less than the total number of pairs of nodes as N goes to i
nfinity)\, often (more precisely when we have some control on the range or
on the decay of the interactions) such minimal-cut problems translate int
o minimal-perimeter problems for sets or partitions on the continuum. This
description is easily understood for periodic lattice systems\, but carri
es over also to random distributions of nodes. In the case of a (locally) u
niform Poisson distribution\, actually the limit minimal-cut problems are
described by more regular energies than in the periodic-lattice case. \nWh
en we relax the hypothesis on the range of interactions\, the description
of the limit of sparse graphs becomes more complex\, as it depends subtly
on geometric characteristics of the graph\, and is partially understood. S
ome easy examples show that\, even though for the continuum limit we still
remain in a similar analytical environment\, the description as (sharp) i
nterfacial energies can be lost in this case\, and more “diffuse” inte
rfaces must be taken into account.\nIf instead we consider dense sequences
of graphs (i.e.\, the number of bonds is of the same order as the total n
umber of pairs as N goes to infinity) then a completely different limit en
vironment must be used\, that of graphons (which are abstract limits of gr
aphs)\, for which sophisticated combinatorial results can be used. We can r
e-read the existing notion of convergence of graphs to graphons as a conve
rgence of the related cut functionals to non-local energies on a simple re
ference parameter set. This convergence provides an approximate descriptio
n of the corresponding minimal-cut problems.\nWork in collaboration with
Alicandro\, Cicalese\, Piatnitski and Solci (sparse graphs) and Cermelli a
nd Dovetta (dense graphs).\n
LOCATION:https://researchseminars.org/talk/OWMADS/10/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Jana de Wiljes (Universität Potsdam\, DE)
DTSTART;VALUE=DATE-TIME:20200629T120000Z
DTEND;VALUE=DATE-TIME:20200629T124500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/11
DESCRIPTION:Title:
Sequential learning for decision support under uncertainty\nby Jana de
Wiljes (Universität Potsdam\, DE) as part of One World seminar: Mathemat
ical Methods for Arbitrary Data Sources (MADS)\n\n\nAbstract\nIn many appl
ication areas there is a need to determine a control variable that optim
izes a pre-specified objective. This problem is particularly challenging w
hen knowledge of the underlying dynamics is subject to various sources of
uncertainty. Such a scenario arises\, for instance\, in the context
of therapy individualization to improve the efficacy and safety of medical
treatment. Mathematical models describing the pharmacokinetics and pharma
codynamics of a drug together with data on associated biomarkers can be le
veraged to support decision-making by predicting therapy outcomes. We pres
ent a continuous learning strategy which follows a novel sequential Monte
Carlo tree search approach and explore how the underlying uncertainties ar
e reflected in the approximated control variable.\n
LOCATION:https://researchseminars.org/talk/OWMADS/11/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Björn Sprungk (TU Freiberg\, DE)
DTSTART;VALUE=DATE-TIME:20200629T130000Z
DTEND;VALUE=DATE-TIME:20200629T134500Z
DTSTAMP;VALUE=DATE-TIME:20240423T103205Z
UID:OWMADS/12
DESCRIPTION:Title:
Noise-level robust Monte Carlo methods for Bayesian inference with informat
ive data\nby Björn Sprungk (TU Freiberg\, DE) as part of One World se
minar: Mathematical Methods for Arbitrary Data Sources (MADS)\n\n\nAbstrac
t\nThe Bayesian approach to inverse problems provides a rigorous framework
for the incorporation and quantification of uncertainties in measurements
\, parameters and models. However\, sampling from or integrating w.r.t. th
e resulting posterior measure can become computationally challenging. In r
ecent years\, a lot of effort has been spent on deriving dimension-indepen
dent methods and on combining efficient sampling strategies with multilevel
or surrogate methods in order to reduce the computational burden of Bayesi
an inverse problems.\nIn this talk\, we are interested in designing numeri
cal methods which are robust w.r.t. the size of the observational noise\,
i.e.\, methods which behave well in case of concentrated posterior measure
s. The concentration of the posterior is a highly desirable situation in p
ractice\, since it relates to informative or large data. However\, it can also
pose a significant computational challenge for numerical methods b
ased on the prior or reference measure. We propose to employ the Laplace a
pproximation of the posterior as the base measure for numerical integratio
n in this context. The Laplace approximation is a Gaussian measure centere
d at the maximum a-posteriori estimate (MAPE) and with covariance matrix d
epending on the Hessian of the log posterior density at the MAPE. We discu
ss convergence results of the Laplace approximation in terms of the Hellin
ger distance and analyze the efficiency of Monte Carlo methods based on it
. In particular\, we show that Laplace-based importance sampling and quasi
-Monte-Carlo as well as Laplace-based Metropolis-Hastings algorithms are r
obust w.r.t. the concentration of the posterior for large classes of poste
rior distributions and integrands whereas prior-based Monte Carlo sampling
methods are not.\n
LOCATION:https://researchseminars.org/talk/OWMADS/12/
END:VEVENT
END:VCALENDAR