Computational imaging is a rapidly growing area that seeks to enhance the capabilities of imaging instruments by viewing imaging as an inverse problem. There are currently two distinct approaches for designing computational imaging methods: model-based and learning-based. Model-based methods leverage analytical signal properties and often come with theoretical guarantees and insights. Learning-based methods leverage data-driven representations for best empirical performance through training on large datasets. This talk presents Regularization by Artifact Removal (RARE)\, as a framework for reconciling both viewpoints by providing a learning-based extension to the classical theory. RARE relies on pre-trained “artifact-removing deep neural nets” for infusing learned prior knowledge into an inverse problem\, while maintaining a clear separation between the prior and physics-based acquisition model. Our results indicate that RARE can achieve state-of-the-art performance in different computational imaging tasks\, while also being amenable to rigorous theoretical analysis. We will focus on the applications of RARE in biomedical imaging\, including magnetic resonance and tomographic imaging.
\n\nThis talk will be based on the following references
\n\n\;
\n LOCATION:https://researchseminars.org/talk/MPML/46/ END:VEVENT BEGIN:VEVENT SUMMARY:Ruth Misener (Imperial College London) DTSTART;VALUE=DATE-TIME:20210618T130000Z DTEND;VALUE=DATE-TIME:20210618T140000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/47 DESCRIPTION:Title: Partition-based formulations for mixed-integer optimization of trained ReLU neural networks\nby Ruth Misener (Imperial College London) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThis work develops a class of relaxations in between the big-M and convex hull formulations of disjunctions\, drawing advantages from both. We show that this class leads to mixed-integer formulations for trained ReLU neural networks. The approach balances model size and tightness by partitioning node inputs into a number of groups and forming the convex hull over the partitions via disjunctive programming. At one extreme\, one partition per input recovers the convex hull of a node\, i.e.\, the tightest possible formulation for each node. For fewer partitions\, we develop smaller relaxations that approximate the convex hull\, and show that they outperform existing formulations. Specifically\, we propose strategies for partitioning variables based on theoretical motivations and validate these strategies using extensive computational experiments.
Furthermore\, the proposed scheme complements known algorithmic approaches\, e.g.\, optimization-based bound tightening captures dependencies within a partition.\n\nThis joint work with Calvin Tsay\, Jan Kronqvist\, Alexander Thebelt is based on two papers: https://arxiv.org/abs/2102.04373 & https://arxiv.org/abs/2101.12708\n LOCATION:https://researchseminars.org/talk/MPML/47/ END:VEVENT BEGIN:VEVENT SUMMARY:Mathieu Blondel (Google Research\, Brain team\, Paris) DTSTART;VALUE=DATE-TIME:20210604T130000Z DTEND;VALUE=DATE-TIME:20210604T140000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/48 DESCRIPTION:Title: Efficient and Modular Implicit Differentiation\nby Mathieu Blondel (Google Research\, Brain team\, Paris) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nAutomatic differentiation (autodiff) has revolutionized machine learning. It allows expressing complex computations by composing elementary ones in creative ways and removes the burden of computing their derivatives by hand. More recently\, differentiation of optimization problem solutions has attracted widespread attention with applications such as optimization as a layer\, and in bi-level problems such as hyper-parameter optimization and meta-learning. However\, the formulas for these derivatives often involve case-by-case tedious mathematical derivations. In this work\, we propose a unified\, efficient and modular approach for implicit differentiation of optimization problems. In our approach\, the user defines (in Python in the case of our implementation) a function F capturing the optimality conditions of the problem to be differentiated. Once this is done\, we leverage autodiff of F and implicit differentiation to automatically differentiate the optimization problem. Our approach thus combines the benefits of implicit differentiation and autodiff.
It is efficient as it can be added on top of any state-of-the-art solver and modular as the optimality condition specification is decoupled from the implicit differentiation mechanism. We show that seemingly simple principles allow us to recover many recently proposed implicit differentiation methods and create new ones easily. We demonstrate the ease of formulating and solving bi-level optimization problems using our framework. We also showcase an application to the sensitivity analysis of molecular dynamics.\n LOCATION:https://researchseminars.org/talk/MPML/48/ END:VEVENT BEGIN:VEVENT SUMMARY:Yuejie Chi (Carnegie Mellon University) DTSTART;VALUE=DATE-TIME:20210625T130000Z DTEND;VALUE=DATE-TIME:20210625T140000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/49 DESCRIPTION:Title: Policy Optimization in Reinforcement Learning: A Tale of Preconditioning and Regularization\nby Yuejie Chi (Carnegie Mellon University) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nPolicy optimization\, which learns the policy of interest by maximizing the value function via large-scale optimization techniques\, lies at the heart of modern reinforcement learning (RL). In addition to value maximization\, other practical considerations arise commonly as well\, including the need of encouraging exploration\, and that of ensuring certain structural properties of the learned policy due to safety\, resource and operational constraints. These considerations can often be accounted for by resorting to regularized RL\, which augments the target value function with a structure-promoting regularization term\, such as Shannon entropy\, Tsallis entropy\, and log-barrier functions. Focusing on an infinite-horizon discounted Markov decision process\, this talk first shows that entropy-regularized natural policy gradient methods converge globally at a linear rate that is nearly independent of the dimension of the state-action space.
Next\, a generalized policy mirror descent algorithm is proposed to accommodate a general class of convex regularizers beyond Shannon entropy. Encouragingly\, this general algorithm inherits similar convergence guarantees\, even when the regularizer lacks strong convexity and smoothness. Our results accommodate a wide range of learning rates\, and shed light upon the role of regularization in enabling fast convergence in RL.\n LOCATION:https://researchseminars.org/talk/MPML/49/ END:VEVENT BEGIN:VEVENT SUMMARY:Ard Louis (University of Oxford) DTSTART;VALUE=DATE-TIME:20210702T130000Z DTEND;VALUE=DATE-TIME:20210702T140000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/50 DESCRIPTION:Title: Deep neural networks have an inbuilt Occam's razor\nby Ard Louis (University of Oxford) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nOne of the most surprising properties of deep neural networks (DNNs) is that they perform best in the overparameterized regime. We are taught early on that having more parameters than data points is a terrible idea. So why do DNNs work so well in a regime where classical learning theory predicts they should heavily overfit? By adapting the coding theorem from algorithmic information theory (which every physicist should learn about!) we show that DNNs are exponentially biased at initialisation to functions that have low descriptional (Kolmogorov) complexity. In other words\, DNNs have an inbuilt Occam's razor\, a bias towards simple functions. We next show that stochastic gradient descent (SGD)\, the most popular optimisation method for DNNs\, follows the same bias\, and so does not itself explain the good generalisation of DNNs. Our approach naturally leads to a marginal-likelihood PAC-Bayes generalisation bound which performs better than any other bounds on the market.
Finally\, we discuss why this bias towards simplicity allows DNNs to perform so well\, and speculate on what this may tell us about the natural world.\n LOCATION:https://researchseminars.org/talk/MPML/50/ END:VEVENT BEGIN:VEVENT SUMMARY:Usman Khan (Tufts University) DTSTART;VALUE=DATE-TIME:20210709T130000Z DTEND;VALUE=DATE-TIME:20210709T140000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/51 DESCRIPTION:Title: Distributed ML: Optimal algorithms for distributed stochastic non-convex optimization\nby Usman Khan (Tufts University) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nIn many emerging applications\, it is of paramount interest to learn hidden parameters from data. For example\, self-driving cars may use onboard cameras to identify pedestrians\, highway lanes\, or traffic signs in various light and weather conditions. Problems such as these can be framed as classification\, regression\, or risk minimization in general\, at the heart of which lies stochastic optimization and machine learning. In many practical scenarios\, distributed and decentralized learning methods are preferable as they benefit from a divide-and-conquer approach towards data at the expense of local (short-range) communication. In this talk\, I will present our recent work that develops a novel algorithmic framework to address various aspects of decentralized stochastic first-order optimization methods for non-convex problems. A major focus will be to characterize regimes where decentralized solutions outperform their centralized counterparts and lead to optimal convergence guarantees. Moreover\, I will characterize certain desirable attributes of decentralized methods in the context of linear speedup and network-independent convergence rates.
Throughout the talk\, I will demonstrate such key aspects of the proposed methods with the help of provable theoretical results and numerical experiments on real data.\n LOCATION:https://researchseminars.org/talk/MPML/51/ END:VEVENT BEGIN:VEVENT SUMMARY:Simon Du (University of Washington) DTSTART;VALUE=DATE-TIME:20210728T160000Z DTEND;VALUE=DATE-TIME:20210728T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/52 DESCRIPTION:Title: Provable Representation Learning\nby Simon Du (University of Washington) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nRepresentation learning has been widely used in many applications. In this talk\, I will present our work\, which uncovers when and why representation learning provably improves the sample efficiency\, from a statistical learning point of view. I will show 1) the existence of a good representation among all tasks\, and 2) the diversity of tasks are key conditions that permit improved statistical efficiency via multi-task representation learning. These conditions provably improve the sample efficiency for functions with certain complexity measures as the representation. If time permits\, I will also talk about leveraging the theoretical insights to improve practical performance.\n LOCATION:https://researchseminars.org/talk/MPML/52/ END:VEVENT BEGIN:VEVENT SUMMARY:J. Nathan Kutz (University of Washington) DTSTART;VALUE=DATE-TIME:20210916T160000Z DTEND;VALUE=DATE-TIME:20210916T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/53 DESCRIPTION:Title: Deep learning for the discovery of parsimonious physics models\nby J.
Nathan Kutz (University of Washington) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nA major challenge in the study of dynamical systems is that of model discovery: turning data into reduced order models that are not just predictive\, but provide insight into the nature of the underlying dynamical system that generated the data. We introduce a number of data-driven strategies for discovering nonlinear multiscale dynamical systems and their embeddings from data. We consider two canonical cases: (i) systems for which we have full measurements of the governing variables\, and (ii) systems for which we have incomplete measurements. For systems with full state measurements\, we show that the recent sparse identification of nonlinear dynamical systems (SINDy) method can discover governing equations with relatively little data and introduce a sampling method that allows SINDy to scale efficiently to problems with multiple time scales\, noise and parametric dependencies. For systems with incomplete observations\, we show that the Hankel alternative view of Koopman (HAVOK) method\, based on time-delay embedding coordinates and the dynamic mode decomposition\, can be used to obtain linear models and Koopman invariant measurement systems that nearly perfectly capture the dynamics of nonlinear quasiperiodic systems. Neural networks are used in targeted ways to aid in the model reduction process.
Together\, these approaches provide a suite of mathematical strategies for reducing the data required to discover and model nonlinear multiscale systems.\n LOCATION:https://researchseminars.org/talk/MPML/53/ END:VEVENT BEGIN:VEVENT SUMMARY:Leong Chuan Kwek (Nanyang Technological University\, Singapore) DTSTART;VALUE=DATE-TIME:20210923T090000Z DTEND;VALUE=DATE-TIME:20210923T100000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/54 DESCRIPTION:Title: Machine Learning and Quantum Technology\nby Leong Chuan Kwek (Nanyang Technological University\, Singapore) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe rise of machine learning in recent times has remarkably transformed science and society. The goal of machine learning is to get computers to act without being explicitly programmed. Machine learning with deep reinforcement learning (RL) was recently recognized as a powerful tool to engineer dynamics in quantum systems. Also\, recently there has been some interest in exploiting and leveraging the limited available quantum resources for performing classically challenging tasks with noisy intermediate-scale quantum (NISQ) computers.
Here\, we discuss some of our recent work on the applications of machine learning to quantum systems.\n LOCATION:https://researchseminars.org/talk/MPML/54/ END:VEVENT BEGIN:VEVENT SUMMARY:Constantino Tsallis (Group of Statistical Physics\, CBPF and Santa Fe Institute) DTSTART;VALUE=DATE-TIME:20211021T160000Z DTEND;VALUE=DATE-TIME:20211021T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/55 DESCRIPTION:Title: Statistical mechanics for complex systems\nby Constantino Tsallis (Group of Statistical Physics\, CBPF and Santa Fe Institute) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nTogether with Newtonian mechanics\, Maxwell electromagnetism\, Einstein relativity and quantum mechanics\, Boltzmann-Gibbs (BG) statistical mechanics constitutes one of the pillars of contemporary theoretical physics\, with uncountable applications in science and technology. This theory applies formidably well to a plethora of physical systems. Still\, it fails in the realm of complex systems\, characterized by generically strong space-time entanglement of their elements. On the basis of a nonadditive entropy (defined by an index q\, which recovers\, for q=1\, the celebrated Boltzmann-Gibbs-von Neumann-Shannon entropy)\, it is possible to generalize the BG theory.
We will briefly review the foundations and applications in natural\, artificial and social systems.\n\nA Bibliography is available at http://tsallis.cat.cbpf.br/biblio.htm\n LOCATION:https://researchseminars.org/talk/MPML/55/ END:VEVENT BEGIN:VEVENT SUMMARY:Volkan Cevher (Laboratory for Information and Inference Systems – LIONS\, EPFL) DTSTART;VALUE=DATE-TIME:20210930T160000Z DTEND;VALUE=DATE-TIME:20210930T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/56 DESCRIPTION:Title: Optimization Challenges in Adversarial Machine Learning\nby Volkan Cevher (Laboratory for Information and Inference Systems – LIONS\, EPFL) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThanks to neural networks (NNs)\, faster computation\, and massive datasets\, machine learning (ML) is under increasing pressure to provide automated solutions to even harder real-world tasks beyond human performance with ever faster response times due to potentially huge technological and societal benefits. Unsurprisingly\, the NN learning formulations present a fundamental challenge to the back-end learning algorithms despite their scalability\, in particular due to the existence of traps in the non-convex optimization landscape\, such as saddle points\, that can prevent algorithms from obtaining “good” solutions.\n\nIn this talk\, we describe our recent research that has demonstrated that the non-convex optimization dogma is false by showing that scalable stochastic optimization algorithms can avoid traps and rapidly obtain locally optimal solutions.
Coupled with the progress in representation learning\, such as over-parameterized neural networks\, such local solutions can be globally optimal.\n\nUnfortunately\, this talk will also demonstrate that the central min-max optimization problems in ML\, such as generative adversarial networks (GANs)\, robust reinforcement learning (RL)\, and distributionally robust ML\, contain spurious attractors that do not include any stationary points of the original learning formulation. Indeed\, we will describe how algorithms are subject to a grander challenge\, including unavoidable convergence failures\, which could explain the stagnation in their progress despite the impressive earlier demonstrations.\n LOCATION:https://researchseminars.org/talk/MPML/56/ END:VEVENT BEGIN:VEVENT SUMMARY:Clément Hongler (EPFL) DTSTART;VALUE=DATE-TIME:20211014T160000Z DTEND;VALUE=DATE-TIME:20211014T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/57 DESCRIPTION:Title: Neural Tangent Kernel\nby Clément Hongler (EPFL) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe Neural Tangent Kernel is a new way to understand the gradient descent in deep neural networks\, connecting them with kernel methods.
In this talk\, I'll introduce this formalism and give a number of results on the Neural Tangent Kernel and explain how they give us insight into the dynamics of neural networks during training and into their generalization features.\n\nBased on joint work with Arthur Jacot and Franck Gabriel.\n LOCATION:https://researchseminars.org/talk/MPML/57/ END:VEVENT BEGIN:VEVENT SUMMARY:George Em Karniadakis (Brown University) DTSTART;VALUE=DATE-TIME:20211104T170000Z DTEND;VALUE=DATE-TIME:20211104T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/58 DESCRIPTION:Title: Operator regression via DeepOnet: Theory\, Algorithms and Applications\nby George Em Karniadakis (Brown University) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nWe will review physics-informed neural networks and summarize available theoretical results. We will also introduce new NNs that learn functionals and nonlinear operators from functions and corresponding responses for system identification. The universal approximation theorem of operators is suggestive of the potential of NNs in learning from scattered data any continuous operator or complex system. We first generalize the theorem to deep neural networks\, and subsequently we apply it to design a new composite NN with small generalization error\, the deep operator network (DeepONet)\, consisting of a NN for encoding the discrete input function space (branch net) and another NN for encoding the domain of the output functions (trunk net). We demonstrate that DeepONet can learn various explicit operators\, e.g.\, integrals\, Laplace transforms and fractional Laplacians\, as well as implicit operators that represent deterministic and stochastic differential equations.
More generally\, DeepONet can learn multiscale operators spanning across many scales and trained by diverse sources of data simultaneously.\n LOCATION:https://researchseminars.org/talk/MPML/58/ END:VEVENT BEGIN:VEVENT SUMMARY:Michael Arbel (INRIA Grenoble Rhône-Alpes) DTSTART;VALUE=DATE-TIME:20211111T170000Z DTEND;VALUE=DATE-TIME:20211111T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/59 DESCRIPTION:Title: Annealed Flow Transport Monte Carlo\nby Michael Arbel (INRIA Grenoble Rhône-Alpes) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nAnnealed Importance Sampling (AIS) and its Sequential Monte Carlo (SMC) extensions are state-of-the-art methods for estimating normalizing constants of probability distributions. We propose here a novel Monte Carlo algorithm\, Annealed Flow Transport (AFT)\, that builds upon AIS and SMC and combines them with normalizing flows (NF) for improved performance. This method transports a set of particles using not only importance sampling (IS)\, Markov chain Monte Carlo (MCMC) and resampling steps - as in SMC\, but also relies on NF which are learned sequentially to push particles towards the successive annealed targets. We provide limit theorems for the resulting Monte Carlo estimates of the normalizing constant and expectations with respect to the target distribution. Additionally\, we show that a continuous-time scaling limit of the population version of AFT is given by a Feynman--Kac measure which simplifies to the law of a controlled diffusion for expressive NF.
We demonstrate experimentally the benefits and limitations of our methodology on a variety of applications.\n LOCATION:https://researchseminars.org/talk/MPML/59/ END:VEVENT BEGIN:VEVENT SUMMARY:Soledad Villar (Mathematical Institute for Data Science at Johns Hopkins University) DTSTART;VALUE=DATE-TIME:20211202T170000Z DTEND;VALUE=DATE-TIME:20211202T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/60 DESCRIPTION:Title: Equivariant machine learning structure like classical physics\nby Soledad Villar (Mathematical Institute for Data Science at Johns Hopkins University) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThere has been enormous progress in the last few years in designing neural networks that respect the fundamental symmetries and coordinate freedoms of physical law. Some of these frameworks make use of irreducible representations\, some make use of high-order tensor objects\, and some apply symmetry-enforcing constraints. Different physical laws obey different combinations of fundamental symmetries\, but a large fraction (possibly all) of classical physics is equivariant to translation\, rotation\, reflection (parity)\, boost (relativity)\, and permutations. Here we show that it is simple to parameterize universally approximating polynomial functions that are equivariant under these symmetries\, or under the Euclidean\, Lorentz\, and Poincare groups\, at any dimensionality d. The key observation is that nonlinear O(d)-equivariant (and related-group-equivariant) functions can be expressed in terms of a lightweight collection of scalars---scalar products and scalar contractions of the scalar\, vector\, and tensor inputs.
These results demonstrate theoretically that gauge-invariant deep learning models for classical physics with good scaling for large problems are feasible right now.\n LOCATION:https://researchseminars.org/talk/MPML/60/ END:VEVENT BEGIN:VEVENT SUMMARY:Pier Luigi Dragotti (Department of Electrical and Electronic Engineering\, Imperial College\, London) DTSTART;VALUE=DATE-TIME:20211209T170000Z DTEND;VALUE=DATE-TIME:20211209T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/61 DESCRIPTION:Title: Computational Imaging for Art investigation and for Neuroscience\nby Pier Luigi Dragotti (Department of Electrical and Electronic Engineering\, Imperial College\, London) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe revolution in sensing\, with the emergence of many new imaging techniques\, offers the possibility of gaining unprecedented access to the physical world\, but this revolution can only bear fruit through the skilful interplay between the physical and computational worlds. This is the domain of computational imaging which advocates that\, to develop effective imaging systems\, it will be necessary to go beyond the traditional decoupled imaging pipeline where device physics\, image processing and the end-user application are considered separately. Instead\, we need to rethink imaging as an integrated sensing and inference model. In this talk we cover two research areas where computational imaging is likely to have an impact.\n\nWe first focus on the heritage sector which is experiencing a digital revolution driven in part by the increasing use of non-invasive\, non-destructive imaging techniques. These new imaging methods provide a way to capture information about an entire painting and can give us information about features at or below the surface of the painting. We focus on Macro X-Ray Fluorescence (XRF) scanning which is a technique for the mapping of chemical elements in paintings.
After describing in broad terms the working of this device\, a method that can process XRF scanning data from paintings is introduced. The method is based on connecting the problem of extracting elemental maps in XRF data to Prony's method\, a technique broadly used in engineering to estimate frequencies of a sum of sinusoids. The results presented show the ability of our method to detect and separate weak signals related to hidden chemical elements in the paintings. We then discuss results on Leonardo's "The Virgin of the Rocks" and show that our algorithm is able to reveal\, more clearly than ever before\, the hidden drawings of a previous composition that Leonardo then abandoned for the painting that we can now see.\n\nIn the second part of the talk\, we focus on two-photon microscopy and neuroscience. To understand how networks of neurons process information\, it is essential to monitor their activity in living tissue. Multi-photon microscopy is unparalleled in its ability to image cellular activity and neural circuits\, deep in living tissue\, at single-cell resolution. However\, in order to achieve step changes in our understanding of brain function\, large-scale imaging studies of neural populations are needed and this can be achieved only by developing computational tools that can enhance the quality of the data acquired and can scan 3-D volumes quickly. In this talk we introduce light-field microscopy and present a method to localize neurons in 3-D. The method is based on the use of proper sparsity priors\, novel optimization strategies and machine learning.\n\n\nThis is joint work with A. Foust\, P. Song\, C. Howe\, H. Verinaz\, J. Huang and Y. Su from Imperial College London\, and C. Higgitt and N.
Daly from The National Gallery in London\n LOCATION:https://researchseminars.org/talk/MPML/61/ END:VEVENT BEGIN:VEVENT SUMMARY:Suman Ravuri (DeepMind) DTSTART;VALUE=DATE-TIME:20211125T170000Z DTEND;VALUE=DATE-TIME:20211125T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/62 DESCRIPTION:Title: Skilful precipitation nowcasting using deep generative models of radar\nby Suman Ravuri (DeepMind) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nPrecipitation nowcasting\, the high-resolution forecasting of precipitation up to two hours ahead\, supports the real-world socioeconomic needs of many sectors reliant on weather-dependent decision-making. State-of-the-art operational nowcasting methods typically advect precipitation fields with radar-based wind estimates\, and struggle to capture important non-linear events such as convective initiations. Recently introduced deep learning methods use radar to directly predict future rain rates\, free of physical constraints. While they accurately predict low-intensity rainfall\, their operational utility is limited because their lack of constraints produces blurry nowcasts at longer lead times\, yielding poor performance on rarer medium-to-heavy rain events. Here we present a deep generative model for the probabilistic nowcasting of precipitation from radar that addresses these challenges. Using statistical\, economic and cognitive measures\, we show that our method provides improved forecast quality\, forecast consistency and forecast value. Our model produces realistic and spatiotemporally consistent predictions over regions up to 1\,536 km × 1\,280 km and with lead times from 5–90 min ahead. Using a systematic evaluation by more than 50 expert meteorologists\, we show that our generative model ranked first for its accuracy and usefulness in 89% of cases against two competitive methods.
When verified quantitatively\, these nowcasts are skillful without resorting to blurring. We show that generative nowcasting can provide probabilistic predictions that improve forecast value and support operational utility\, and at resolutions and lead times where alternative methods struggle.\n LOCATION:https://researchseminars.org/talk/MPML/62/ END:VEVENT BEGIN:VEVENT SUMMARY:Dan Roberts (MIT\, Center for Theoretical Physics) DTSTART;VALUE=DATE-TIME:20220113T170000Z DTEND;VALUE=DATE-TIME:20220113T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/63 DESCRIPTION:Title: The Principles of Deep Learning Theory\nby Dan Roberts (MIT\, Center for Theoretical Physics) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nDeep learning is an exciting approach to modern artificial intelligence based on artificial neural networks. The goal of this talk is to provide a blueprint — using tools from physics — for theoretically analyzing deep neural networks of practical relevance. This task will encompass both understanding the statistics of initialized deep networks and determining the training dynamics of such an ensemble when learning from data.\n\nThis talk is based on a book\, "The Principles of Deep Learning Theory\," co-authored with Sho Yaida and based on research also in collaboration with Boris Hanin.
It will be published next year by Cambridge University Press.\n LOCATION:https://researchseminars.org/talk/MPML/63/ END:VEVENT BEGIN:VEVENT SUMMARY:Anders Hansen (Faculty of Mathematics and Department of Applied Mathematics and Theoretical Physics\, University of Cambridge) DTSTART;VALUE=DATE-TIME:20220120T170000Z DTEND;VALUE=DATE-TIME:20220120T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/64 DESCRIPTION:Title: Why things don’t work — On the extended Smale's 9th and 18th problems (the limits of AI) and methodological barriers\nby Anders Hansen (Faculty of Mathematics and Department of Applied Mathematics and Theoretical Physics\, University of Cambridge) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe alchemists wanted to create gold\, Hilbert wanted an algorithm to solve Diophantine equations\, researchers want to make deep learning robust in AI\, MATLAB wants (but fails) to detect when it provides wrong solutions to linear programs etc. Why does one not succeed in so many of these fundamental cases? The reason is typically methodological barriers. The history of science is full of methodological barriers — reasons for why we never succeed in reaching certain goals. In many cases\, this is due to the foundations of mathematics. We will present a new program on methodological barriers and foundations of mathematics\, where — in this talk — we will focus on two basic problems: (1) The instability problem in deep learning: Why do researchers fail to produce stable neural networks in basic classification and computer vision problems that can easily be handled by humans — when one can prove that there exist stable and accurate neural networks? Moreover\, AI algorithms can typically not detect when they are wrong\, which becomes a serious issue when striving to create trustworthy AI.
The problem is more general\, as for example MATLAB's linprog routine is incapable of certifying correct solutions of basic linear programs. Thus\, we’ll address the following question: (2) Why are algorithms (in AI and computations in general) incapable of determining when they are wrong? These questions are deeply connected to the extended Smale’s 9th and 18th problems on the list of mathematical problems for the 21st century.\n LOCATION:https://researchseminars.org/talk/MPML/64/ END:VEVENT BEGIN:VEVENT SUMMARY:Joosep Pata (National Institute of Chemical Physics and Biophysics\, Estonia) DTSTART;VALUE=DATE-TIME:20220203T170000Z DTEND;VALUE=DATE-TIME:20220203T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/65 DESCRIPTION:Title: Machine learning for data reconstruction at the LHC\nby Joosep Pata (National Institute of Chemical Physics and Biophysics\, Estonia) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nPhysics analyses at the CERN experiments rely on detector hits being interpreted or reconstructed as particle candidates. The data reconstruction systems are built on decades of physics and detector knowledge and must operate reliably on petabytes of data in diverse computing centers spread around the world. In recent years\, machine learning (ML) has played an increasingly important role at the LHC experiments for reconstructing and interpreting the data\, from calibrating the detector readouts to the final interpretation for complex signal processes. We will discuss the various aspects of ML at the LHC experiments\, focusing on data reconstruction and particle identification approaches using modern machine learning methods such as graph neural networks. We will present a concrete\, detailed example from machine learned particle flow (MLPF)\, an R&D effort to develop a fully optimizable particle flow reconstruction across detector subsystems in CMS. 
\n LOCATION:https://researchseminars.org/talk/MPML/65/ END:VEVENT BEGIN:VEVENT SUMMARY:Jan Kieseler (European Organization for Nuclear Research (CERN)) DTSTART;VALUE=DATE-TIME:20220303T170000Z DTEND;VALUE=DATE-TIME:20220303T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/66 DESCRIPTION:Title: The MODE project\nby Jan Kieseler (European Organization for Nuclear Research (CERN)) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe effective design of instruments that rely on the interaction of radiation with matter for their operation is a complex task. Furthermore\, the underlying physics processes are intrinsically stochastic in nature and open a vast space of possible choices for the physical characteristics of the instrument. While even large-scale detectors such as those at the LHC are built using surrogates for the ultimate physics objective\, the MODE Collaboration (an acronym for Machine-learning Optimized Design of Experiments) aims at developing tools also based on deep learning techniques to achieve end-to-end optimization of the design of instruments via a fully differentiable pipeline capable of exploring the Pareto-optimal frontier of the utility function for future particle collider experiments and related detectors. The construction of such a differentiable model requires inclusion of information-extraction procedures\, including data collection\, detector response\, pattern recognition\, and other existing constraints such as cost. This talk will give an introduction to the goals of the newly founded MODE collaboration and highlight some of the already existing ingredients.\n LOCATION:https://researchseminars.org/talk/MPML/66/ END:VEVENT BEGIN:VEVENT SUMMARY:Fernando E. 
Rosas (Faculty of Medicine\, Department of Brain Sciences\, Imperial College) DTSTART;VALUE=DATE-TIME:20220324T170000Z DTEND;VALUE=DATE-TIME:20220324T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/67 DESCRIPTION:Title: Towards a deeper understanding of high-order interdependencies in complex systems\nby Fernando E. Rosas (Faculty of Medicine\, Department of Brain Sciences\, Imperial College) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nWe live in an increasingly interconnected world and\, unfortunately\, our understanding of interdependency is still limited. As a matter of fact\, while bivariate relationships are at the core of most of our data analysis methods\, there is still no principled theory to account for the different types of interactions that can occur between three or more variables. This talk explores the vast and largely unexplored territory of multivariate complexity\, and discusses information-theoretic approaches that have been introduced to fill this important knowledge gap.\n\nThe first part of the talk is devoted to synergistic phenomena\, which correspond to statistical regularities that affect the whole but not the parts. We explain how synergy can be effectively captured by information-theoretic measures inspired by the nature of high brain functions\, and how these measures allow us to map complex interdependencies into hypergraphs. The second part of the talk focuses on a new theory of what constitutes causal emergence\, and how it can be measured from time series data. This theory enables a formal\, quantitative account of downward causation\, and introduces “causal decoupling” as a complementary modality of emergence. Importantly\, this not only establishes conceptual tools to frame conjectures about emergence rigorously\, but also provides practical procedures to test them on data. 
We illustrate the considered analysis tools on different case studies\, including cellular automata\, baroque music\, flocking models\, and neuroimaging datasets.\n LOCATION:https://researchseminars.org/talk/MPML/67/ END:VEVENT BEGIN:VEVENT SUMMARY:Josef Urban (Czech Institute of Informatics\, Robotics and Cybernetics (CIIRC)) DTSTART;VALUE=DATE-TIME:20220331T160000Z DTEND;VALUE=DATE-TIME:20220331T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/68 DESCRIPTION:Title: Machine Learning and Theorem Proving\nby Josef Urban (Czech Institute of Informatics\, Robotics and Cybernetics (CIIRC)) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe talk will describe several ways in which machine learning is combined with theorem proving today over large corpora of formal proofs. If time permits\, I will also show some demos of the systems and mention related topics such as ML-guided conjecturing and autoformalization.\n LOCATION:https://researchseminars.org/talk/MPML/68/ END:VEVENT BEGIN:VEVENT SUMMARY:André F. T. Martins (Instituto Superior Técnico) DTSTART;VALUE=DATE-TIME:20220224T163000Z DTEND;VALUE=DATE-TIME:20220224T173000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/69 DESCRIPTION:Title: From Sparse Modeling to Sparse Communication\nby André F. T. Martins (Instituto Superior Técnico) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nNeural networks and other machine learning models compute continuous representations\, while humans communicate mostly through discrete symbols. Reconciling these two forms of communication is desirable for generating human-readable interpretations or learning discrete latent variable models\, while maintaining end-to-end differentiability.\n\nIn the first part of the talk\, I will describe how sparse modeling techniques can be extended and adapted for facilitating sparse communication in neural models. 
The building block is a family of sparse transformations called alpha-entmax\, a drop-in replacement for softmax\, which contains sparsemax as a particular case. Entmax transformations are differentiable and (unlike softmax) they can return sparse probability distributions\, useful to build interpretable attention mechanisms. Variants of these sparse transformations have been applied with success to machine translation\, natural language inference\, visual question answering\, and other tasks.\n\nIn the second part\, I will introduce mixed random variables\, which are in-between the discrete and continuous worlds. We build rigorous theoretical foundations for these hybrids\, via a new “direct sum” base measure defined on the face lattice of the probability simplex. From this measure\, we introduce new entropy and Kullback-Leibler divergence functions that subsume the discrete and differential cases and have interpretations in terms of code optimality. Our framework suggests two strategies for representing and sampling mixed random variables\, an extrinsic (“sample-and-project”) and an intrinsic one (based on face stratification).\n\nIn the third part\, I will show how sparse transformations can also be used to design new loss functions\, replacing the cross-entropy loss. To this end\, I will introduce the family of Fenchel-Young losses\, revealing connections between generalized entropy regularizers and separation margin. 
I will illustrate with applications in natural language generation\, morphology\, and machine translation.\n\nThis work was funded by the DeepSPIN ERC project - https://deep-spin.github.io\n LOCATION:https://researchseminars.org/talk/MPML/69/ END:VEVENT BEGIN:VEVENT SUMMARY:Dmitry Krotov (Watson AI Lab and IBM Research in Cambridge) DTSTART;VALUE=DATE-TIME:20220414T160000Z DTEND;VALUE=DATE-TIME:20220414T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/70 DESCRIPTION:Title: Modern Hopfield Networks in AI and Neurobiology\nby Dmitry Krotov (Watson AI Lab and IBM Research in Cambridge) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nModern Hopfield Networks or Dense Associative Memories are recurrent neural networks with fixed point attractor states that are described by an energy function. In contrast to conventional Hopfield Networks\, their modern versions have a very large memory storage capacity\, which makes them appealing tools for many problems in machine learning and cognitive and neurosciences. In this talk I will introduce an intuition and a mathematical formulation of this class of models\, and will give examples of problems in AI that can be tackled using these new ideas. I will also explain how different individual models of this class (e.g. hierarchical memories\, attention mechanism in transformers\, etc.) arise from their general mathematical formulation with the Lagrangian functions.
\n LOCATION:https://researchseminars.org/talk/MPML/70/ END:VEVENT BEGIN:VEVENT SUMMARY:Emtiyaz Khan (RIKEN-AIP\, Tokyo and OIST\, Okinawa\, Japan) DTSTART;VALUE=DATE-TIME:20220428T090000Z DTEND;VALUE=DATE-TIME:20220428T100000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/71 DESCRIPTION:Title: The Bayesian Learning Rule for Adaptive AI\nby Emtiyaz Khan (RIKEN-AIP\, Tokyo and OIST\, Okinawa\, Japan) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nHumans and animals have a natural ability to autonomously learn and quickly adapt to their surroundings. How can we design AI systems that do the same? In this talk\, I will present Bayesian principles to bridge such gaps between humans and AI. I will show that a wide variety of machine-learning algorithms are instances of a single learning rule called the Bayesian learning rule. The rule unravels a dual perspective yielding new adaptive mechanisms for machine-learning-based AI systems. My hope is to convince the audience that Bayesian principles are indispensable for an AI that learns as efficiently as we do.\n\nReference: M.E. Khan\, H. Rue\, The Bayesian Learning Rule [arXiv]
\n LOCATION:https://researchseminars.org/talk/MPML/71/ END:VEVENT BEGIN:VEVENT SUMMARY:Rianne van den Berg (Microsoft Research Amsterdam) DTSTART;VALUE=DATE-TIME:20220421T160000Z DTEND;VALUE=DATE-TIME:20220421T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/72 DESCRIPTION:Title: Generative models for discrete random variables\nby Rianne van den Berg (Microsoft Research Amsterdam) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nIn this talk I will discuss how different classes of generative models can be adapted to handle discrete random variables\, and how this can be used to connect generative models to downstream tasks such as lossless compression. I will start by discussing normalizing flow models\, and the challenges that arise when converting these models that are typically designed for real-valued random variables to discrete random variables. Next\, I will demonstrate how denoising diffusion models with discrete state spaces have a rich design space in terms of the noising process\, and how this influences the performance of the learned denoising model. Finally\, I will show how denoising diffusion models can be connected to autoregressive models\, and introduce an autoregressive model with a random generation order.\n LOCATION:https://researchseminars.org/talk/MPML/72/ END:VEVENT BEGIN:VEVENT SUMMARY:Andrea L. Bertozzi (University of California Los Angeles) DTSTART;VALUE=DATE-TIME:20220505T160000Z DTEND;VALUE=DATE-TIME:20220505T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/73 DESCRIPTION:Title: Graph based models in semi-supervised and unsupervised learning\nby Andrea L. Bertozzi (University of California Los Angeles) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nSimilarity graphs provide a structure for analyzing high dimensional data. 
These undirected weighted graphs provide structure for identifying inherent clusters in datasets and many methods exist to sort through such data building on the graph Laplacian matrix. One way to think about such problems is in terms of penalized cut problems. These can be expressed in terms of the graph total variation which has a well-known analogue in Euclidean space. We show how to use ideas from geometric methods for PDEs to develop efficient and high performing methods for semi-supervised and unsupervised learning. These methods also extend to active learning and to modularity optimization for community detection on networks.\n LOCATION:https://researchseminars.org/talk/MPML/73/ END:VEVENT BEGIN:VEVENT SUMMARY:Stanley Osher (Department of Mathematics\, University of California\, Los Angeles) DTSTART;VALUE=DATE-TIME:20220519T160000Z DTEND;VALUE=DATE-TIME:20220519T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/74 DESCRIPTION:Title: Conservation laws and generalized optimal transport\nby Stanley Osher (Department of Mathematics\, University of California\, Los Angeles) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nIn this talk\, we connect Lax’s entropy-entropy flux in conservation laws with optimal transport type metric spaces. Following this connection\, we further design variational discretizations for conservation laws and mean field control of conservation laws. 
In particular\, we design unconditionally stable time discretization methods that are easy to implement.\n\nBased on joint work with Siting Liu (UCLA) and Wuchen Li (University of South Carolina).\n LOCATION:https://researchseminars.org/talk/MPML/74/ END:VEVENT BEGIN:VEVENT SUMMARY:Anja Butter (ITP\, University of Heidelberg) DTSTART;VALUE=DATE-TIME:20220602T160000Z DTEND;VALUE=DATE-TIME:20220602T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/75 DESCRIPTION:Title: Machine Learning and LHC Event Generation\nby Anja Butter (ITP\, University of Heidelberg) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nFirst-principle simulations are at the heart of the high-energy physics research program. They link the vast data output of multi-purpose detectors with fundamental theory predictions and interpretation. In the coming LHC runs\, these simulations will face unprecedented precision requirements to match the experimental accuracy. New ideas and tools based on neural networks have been developed at the interface of particle physics and machine learning. They can improve the speed and precision of forward simulations and handle the complexity of collision data. Such networks can be employed within established simulation tools or as part of a new framework. Since neural networks can be inverted\, they open new avenues in LHC analyses.\n LOCATION:https://researchseminars.org/talk/MPML/75/ END:VEVENT BEGIN:VEVENT SUMMARY:Paulo Tabuada (UCLA) DTSTART;VALUE=DATE-TIME:20220609T160000Z DTEND;VALUE=DATE-TIME:20220609T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/77 DESCRIPTION:Title: Deep neural networks\, universal approximation\, and geometric control\nby Paulo Tabuada (UCLA) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nDeep neural networks have drastically changed the landscape of several engineering areas such as computer vision and natural language processing. 
Notwithstanding the widespread success of deep networks in these and many other areas\, it is still not well understood why deep neural networks work so well. In particular\, the question of which functions can be learned by deep neural networks has remained unanswered.\nIn this talk we give an answer to this question for deep residual neural networks\, a class of deep networks that can be interpreted as the time discretization of nonlinear control systems. We will show that the ability of these networks to memorize training data can be expressed through the control theoretic notion of controllability which can be proved using geometric control techniques. We then add an additional ingredient\, monotonicity\, to conclude that deep residual networks can approximate\, to arbitrary accuracy with respect to the uniform norm\, any continuous function on a compact subset of n-dimensional Euclidean space by using at most n+1 neurons per layer. We will conclude the talk by showing how these results pave the way for the use of deep networks in the perception pipeline of autonomous systems while providing formal (and probability free) guarantees of stability and robustness.\n LOCATION:https://researchseminars.org/talk/MPML/77/ END:VEVENT BEGIN:VEVENT SUMMARY:Petar Veličković (DeepMind and University of Cambridge) DTSTART;VALUE=DATE-TIME:20220929T160000Z DTEND;VALUE=DATE-TIME:20220929T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/78 DESCRIPTION:Title: Geometric Deep Learning: Grids\, Graphs\, Groups\, Geodesics and Gauges\nby Petar Veličković (DeepMind and University of Cambridge) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe last decade has witnessed an experimental revolution in data science and machine learning\, epitomised by deep learning methods. 
Indeed\, many high-dimensional learning tasks previously thought to be beyond reach (such as computer vision\, playing Go\, or protein folding) are in fact feasible with appropriate computational scale. Remarkably\, the essence of deep learning is built from two simple algorithmic principles: first\, the notion of representation or feature learning\, whereby adapted\, often hierarchical\, features capture the appropriate notion of regularity for each task\, and second\, learning by local gradient-descent type methods\, typically implemented as backpropagation.\n\nWhile learning generic functions in high dimensions is a cursed estimation problem\, most tasks of interest are not generic\, and come with essential pre-defined regularities arising from the underlying low-dimensionality and structure of the physical world. This talk is concerned with exposing these regularities through unified geometric principles that can be applied throughout a wide spectrum of applications.\n\nSuch a 'geometric unification' endeavour in the spirit of Felix Klein's Erlangen Program serves a dual purpose: on one hand\, it provides a common mathematical framework to study the most successful neural network architectures\, such as CNNs\, RNNs\, GNNs\, and Transformers. 
On the other hand\, it gives a constructive procedure to incorporate prior physical knowledge into neural architectures and provides a principled way to build future architectures yet to be invented.\n LOCATION:https://researchseminars.org/talk/MPML/78/ END:VEVENT BEGIN:VEVENT SUMMARY:Yongji Wang (Department of Geosciences\, Princeton University) DTSTART;VALUE=DATE-TIME:20220526T160000Z DTEND;VALUE=DATE-TIME:20220526T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/79 DESCRIPTION:Title: Physics-informed neural networks for solving 3-D Euler equation\nby Yongji Wang (Department of Geosciences\, Princeton University) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nOne of the most challenging open questions in mathematical fluid dynamics is whether an inviscid incompressible fluid\, described by the 3-dimensional Euler equations\, with initially smooth velocity and finite energy can develop singularities in finite time. This long-standing open problem is closely related to one of the seven Millennium Prize Problems\, which considers the viscous analogue of the Euler equations (the Navier-Stokes equations). In this talk\, I will describe how we leverage the power of deep learning\, using deep neural networks with equation constraints\, namely physics-informed neural networks (PINNs)\, to find a smooth self-similar blow-up solution for the 3-dimensional Euler equations in the presence of a cylindrical boundary. To the best of our knowledge\, the solution represents the first example of a truly 2-D or higher dimensional backwards self-similar solution. This new numerical framework based on PINNs is shown to be robust and readily adaptable to other fluid equations\, which sheds new light on the century-old mystery of capital importance in the field of mathematical fluid dynamics.\n LOCATION:https://researchseminars.org/talk/MPML/79/ END:VEVENT BEGIN:VEVENT SUMMARY:John Baez (U.C. 
Riverside) DTSTART;VALUE=DATE-TIME:20220616T170000Z DTEND;VALUE=DATE-TIME:20220616T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/80 DESCRIPTION:Title: Shannon Entropy from Category Theory\nby John Baez (U.C. Riverside) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nShannon entropy is a powerful concept. But what properties single out Shannon entropy as special? Instead of focusing on the entropy of a probability measure on a finite set\, it can help to focus on the "information loss"\, or change in entropy\, associated with a measure-preserving function. Shannon entropy then gives the only concept of information loss that is functorial\, convex-linear and continuous.\n\nThis is joint work with Tom Leinster and Tobias Fritz.\n LOCATION:https://researchseminars.org/talk/MPML/80/ END:VEVENT BEGIN:VEVENT SUMMARY:Dario Izzo (European Space Agency) DTSTART;VALUE=DATE-TIME:20220630T160000Z DTEND;VALUE=DATE-TIME:20220630T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/81 DESCRIPTION:Title: Geodesy of irregular small bodies via neural density fields: geodesyNets\nby Dario Izzo (European Space Agency) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe problem of determining the density distribution of celestial bodies from the induced gravitational pull is of great importance in astrophysics as well as space engineering (think of situations where spacecraft need to perform orbital and surface proximity operations). Knowledge of a body's density distribution also provides great insight into the body's origin and composition. In practice\, the state-of-the-art approaches for modelling the gravity field of extended bodies are spherical harmonics models\, mascon models and polyhedral gravity models. 
All of these\, however\, while being widely studied and developed since the early works from Laplace\, introduce requirements such as knowledge of a shape model\, assumption of a homogeneous internal density\, being outside the Brillouin sphere\, etc.\n\n\nIn this talk\, we introduce and explain Neural Density Fields\, a new approach to represent the density of extended bodies and learn its accurate form by inverting data from gravitational accelerations\, orbits or the gravity potential. The resulting deep learning model\, called geodesyNets\, is able to compete with classical approaches while solving most of their limitations. We also introduce eclipseNets\, a deep learning model based on related ideas and able to learn the eclipse shadow cones of irregular bodies\, thus allowing highly precise propagation and stability studies.\n LOCATION:https://researchseminars.org/talk/MPML/81/ END:VEVENT BEGIN:VEVENT SUMMARY:Audrey Durand (IID\, Université Laval\, Canada) DTSTART;VALUE=DATE-TIME:20220707T160000Z DTEND;VALUE=DATE-TIME:20220707T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/82 DESCRIPTION:Title: Interactive learning for Neurosciences - Between Simulation and Reality\nby Audrey Durand (IID\, Université Laval\, Canada) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nLearning a behaviour to conduct a given task can be achieved by interacting with the environment. This is the crux of reinforcement learning (RL)\, where an (automated) agent learns to solve a problem through an iterative trial-and-error process. More specifically\, an RL agent can interact with the environment and learn from these interactions by observing feedback on the goal task. Therefore\, these methods typically require the ability to intervene on the environment and make (possibly a very large number of) mistakes. 
Although this can be a limiting factor in some applications\, simple RL settings\, such as bandit settings\, can still host a variety of problems for interactively learning behaviours. In other situations\, simulation might be the key.\n\nIn this talk\, we will show that RL can be used to formulate and tackle data acquisition (imaging) problems in neurosciences. We will see how bandit methods can be used to optimize super-resolution imaging by learning on real devices through an actual empirical process. We will also see how simulation can be leveraged to learn more sequential decision making strategies. These applications highlight the potential of RL to support expert users on difficult tasks and enable new discoveries.\n LOCATION:https://researchseminars.org/talk/MPML/82/ END:VEVENT BEGIN:VEVENT SUMMARY:Joseph Bakarji (University of Washington) DTSTART;VALUE=DATE-TIME:20220714T160000Z DTEND;VALUE=DATE-TIME:20220714T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/83 DESCRIPTION:Title: Dimensionally Consistent Learning with Buckingham Pi\nby Joseph Bakarji (University of Washington) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nDimensional analysis is a robust technique for extracting insights and finding symmetries in physical systems\, especially when the governing equations are not known. The Buckingham Pi theorem provides a procedure for finding a set of dimensionless groups from given measurements\, although this set is not unique. We propose an automated approach using the symmetric and self-similar structure of available measurement data to discover the dimensionless groups that best collapse this data to a lower dimensional space according to an optimal fit. 
We develop three data-driven techniques that use the Buckingham Pi theorem as a constraint: (i) a constrained optimization problem with a nonparametric function\, (ii) a deep learning algorithm (BuckiNet) that projects the input parameter space to a lower dimension in the first layer\, and (iii) a sparse identification of nonlinear dynamics (SINDy) to discover dimensionless equations whose coefficients parameterize the dynamics. I discuss the accuracy and robustness of these methods when applied to known nonlinear systems.\n LOCATION:https://researchseminars.org/talk/MPML/83/ END:VEVENT BEGIN:VEVENT SUMMARY:Inês Hipólito (Humboldt-Universität) DTSTART;VALUE=DATE-TIME:20220908T160000Z DTEND;VALUE=DATE-TIME:20220908T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/84 DESCRIPTION:Title: The Free Energy Principle in the Edge of Chaos\nby Inês Hipólito (Humboldt-Universität) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nLiving beings do an extraordinary thing. By being alive they are resisting the second law of thermodynamics. This law stipulates that open\, living systems tend toward dissipation by the increase of entropy or chaos. From minimal cognitive organisms like plants to more complex organisms equipped with nervous systems\, all living systems adjust and adapt to their environments\, thereby resisting the second law. Impressively\, while all animals cognitively enact and survive their local environments\, more complex systems do so also by actively constructing their local environments\, thereby not only defying the second law\, but also (evolution) selective properties. Because all living beings defy the second law by adjusting and engaging with the environment\, a prominent question is how do living organisms persist while engaging in adaptive exchanges with their complex environments? 
In this talk I will offer an overview of how the Free Energy Principle (FEP) offers a principled solution to this problem. The FEP prescribes that living systems maintain themselves by remaining in non-equilibrium steady states by restricting themselves to a limited number of states\; it has been widely applied to explain neurocognitive function and embodied action\, develop artificial intelligence and inspire psychopathology models.\n LOCATION:https://researchseminars.org/talk/MPML/84/ END:VEVENT BEGIN:VEVENT SUMMARY:Robert Nowak (University of Wisconsin-Madison) DTSTART;VALUE=DATE-TIME:20221027T160000Z DTEND;VALUE=DATE-TIME:20221027T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/85 DESCRIPTION:Title: The Neural Balance Theorem and its Consequences\nby Robert Nowak (University of Wisconsin-Madison) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\nAbstract: TBA\n LOCATION:https://researchseminars.org/talk/MPML/85/ END:VEVENT BEGIN:VEVENT SUMMARY:Frederico Fiuza (SLAC) DTSTART;VALUE=DATE-TIME:20221103T170000Z DTEND;VALUE=DATE-TIME:20221103T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/86 DESCRIPTION:Title: Accelerating the understanding of nonlinear dynamical systems using machine learning\nby Frederico Fiuza (SLAC) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe description of nonlinear\, multi-scale dynamics is a common challenge in a wide range of physical systems and research fields\, from weather forecasting to controlled nuclear fusion. The development of reduced models that balance accuracy and complexity is critical to advancing theoretical comprehension and enabling holistic computational descriptions of these problems. 
I will discuss how techniques from statistical and machine learning are offering new ways of inferring reduced physics models from the increasingly abundant data of nonlinear dynamics produced by experiments\, observations\, and simulations. In particular\, I will focus on how sparse regression techniques can be used to infer interpretable plasma physics models (in the form of nonlinear partial differential equations) directly from the data of first-principles fully-kinetic simulations. The potential of this approach is demonstrated by recovering the fundamental hierarchy of plasma physics models based solely on particle-based simulation data of complex plasma dynamics. I will discuss how this data-driven methodology provides a promising tool to accelerate the development of reduced theoretical models of nonlinear dynamical systems and to design computationally efficient algorithms for multi-scale simulations.\n LOCATION:https://researchseminars.org/talk/MPML/86/ END:VEVENT BEGIN:VEVENT SUMMARY:Markus Reichstein (MPI for Biogeochemistry) DTSTART;VALUE=DATE-TIME:20221124T170000Z DTEND;VALUE=DATE-TIME:20221124T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/87 DESCRIPTION:Title: Integrating Machine Learning with System Modelling and Observations for a better understanding of the Earth System\nby Markus Reichstein (MPI for Biogeochemistry) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe Earth is a complex dynamic networked system. Machine learning\, i.e. derivation of computational models from data\, has already made important contributions to predict and understand components of the Earth system\, specifically in climate\, remote sensing and environmental sciences. For instance\, classifications of land cover types\, prediction of land-atmosphere and ocean-atmosphere exchange\, or detection of extreme events have greatly benefited from these approaches. 
Such data- driven information has already changed how Earth system models are evaluat ed and further developed. However\, many studies have not yet sufficiently addressed and exploited dynamic aspects of systems\, such as memory effec ts for prediction and effects of spatial context\, e.g. for classification and change detection. In particular new developments in deep learning off er great potential to overcome these limitations. Yet\, a key challenge an d opportunity is to integrate (physical-biological) system modeling approa ches with machine learning into hybrid modeling approaches\, which combine s physical consistency and machine learning versatility. A couple of examp les are given with focus on the terrestrial biosphere\, where the combinat ion of system-based and machine-learning-based modelling helps our underst anding of aspects of the Earth system.\n LOCATION:https://researchseminars.org/talk/MPML/87/ END:VEVENT BEGIN:VEVENT SUMMARY:Bruno Loureiro (École Polytechnique Fédérale de Lausanne (EPFL) ) DTSTART;VALUE=DATE-TIME:20221215T170000Z DTEND;VALUE=DATE-TIME:20221215T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/88 DESCRIPTION:Title: Ph ase diagram of Stochastic Gradient Descent in high-dimensional two-layer n eural networks\nby Bruno Loureiro (École Polytechnique Fédérale de Lausanne (EPFL)) as part of Mathematics\, Physics and Machine Learning (IS T\, Lisbon)\n\n\nAbstract\nDespite the non-convex optimization landscape\, over-parametrized shallow networks are able to achieve global convergence under gradient descent. The picture can be radically different for narrow networks\, which tend to get stuck in badly-generalizing local minima. He re we investigate the cross-over between these two regimes in the high-dim ensional setting\, and in particular investigate the connection between th e so-called mean-field/hydrodynamic regime and the seminal approach of Saa d & Solla. 
Focusing on the case of Gaussian data\, we study the interplay between the learning rate\, the time scale\, and the number of hidden unit s in the high-dimensional dynamics of stochastic gradient descent (SGD). O ur work builds on a deterministic description of SGD in high-dimensions fr om statistical physics\, which we extend and for which we provide rigorous convergence rates.\n LOCATION:https://researchseminars.org/talk/MPML/88/ END:VEVENT BEGIN:VEVENT SUMMARY:Diogo Gomes (KAUST) DTSTART;VALUE=DATE-TIME:20221014T083000Z DTEND;VALUE=DATE-TIME:20221014T110000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/89 DESCRIPTION:Title: Fr om Calculus of Variations to Reinforcement Learning (Lectures 1 & 2)\n by Diogo Gomes (KAUST) as part of Mathematics\, Physics and Machine Learni ng (IST\, Lisbon)\n\n\nAbstract\nThis course begins with a brief introduct ion to classical calculus of variations and its applications to classical problems such as geodesic trajectories and the brachistochrone problem. Th en\, we examine Hamilton-Jacobi equations\, the role of convexity and the classical verification theorem. Next\, we illustrate the lack of classical solutions and motivate the definition of viscosity solutions. The course ends with a brief description of the reinforcement learning problem and it s connection with Hamilton-Jacobi equations.\n LOCATION:https://researchseminars.org/talk/MPML/89/ END:VEVENT BEGIN:VEVENT SUMMARY:José Miguel Urbano (KAUST) DTSTART;VALUE=DATE-TIME:20221014T133000Z DTEND;VALUE=DATE-TIME:20221014T160000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/90 DESCRIPTION:Title: Se mi-Supervised Learning and the infinite-Laplacian (Lectures 1 & 2)\nby José Miguel Urbano (KAUST) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nMotivated by a recent application i n Semi-Supervised Learning (SSL)\, the minicourse is a brief introduction to the analysis of infinity-harmonic functions. 
We will discuss the Lipschitz extension problem\, its solution via McShane-Whitney extensions and its several drawbacks\, leading to the notion of AMLE (Absolutely Minimising Lipschitz Extension). We then explore the equivalence between being absolutely minimising Lipschitz\, enjoying comparison with cones and solving the infinity-Laplace equation in the viscosity sense.\n LOCATION:https://researchseminars.org/talk/MPML/90/ END:VEVENT BEGIN:VEVENT SUMMARY:João Sacramento (ETH Zürich) DTSTART;VALUE=DATE-TIME:20221110T170000Z DTEND;VALUE=DATE-TIME:20221110T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/91 DESCRIPTION:Title: The least-control principle for learning at equilibrium\nby João Sacramento (ETH Zürich) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nA large number of models of interest in both neuroscience and machine learning can be expressed as dynamical systems at equilibrium. This class of systems includes deep neural networks\, equilibrium recurrent neural networks\, and meta-learning. In this talk I will present a new principle for learning equilibria with a temporally and spatially local rule. Our principle casts learning as a least-control problem\, where we first introduce an optimal controller to lead the system towards a solution state\, and then define learning as reducing the amount of control needed to reach such a state. We show that incorporating learning signals within the dynamics as an optimal control enables transmitting activity-dependent credit assignment information\, avoids storing intermediate states in memory\, and does not rely on infinitesimal learning signals. In practice\, our principle leads to strong performance matching that of leading gradient-based learning methods when applied to an array of benchmarking experiments. 
Our results shed light on how the brain might learn and offer new ways of approaching a broad class of machine learning problems.\n LOCATION:https://researchseminars.org/talk/MPML/91/ END:VEVENT BEGIN:VEVENT SUMMARY:Tom Goldstein (University of Maryland) DTSTART;VALUE=DATE-TIME:20221117T170000Z DTEND;VALUE=DATE-TIME:20221117T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/92 DESCRIPTION:Title: Building (and breaking) neural networks that think fast and slow\nby Tom Goldstein (University of Maryland) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nMost neural networks are built to solve simple pattern-matching tasks\, a process that is often known as “fast” thinking. In this talk\, I’ll use adversarial methods to explore the robustness of neural networks. I’ll also discuss whether vulnerabilities of AI systems that have been observed in academic labs can pose real security threats to industrial systems. Then\, I’ll present methods for constructing neural networks that exhibit “slow” thinking abilities akin to human logical reasoning. Rather than learning simple pattern-matching rules\, these networks have the ability to synthesize algorithmic reasoning processes and solve difficult discrete search and planning problems that cannot be solved by conventional AI systems. 
Interestingly\, these reasoning systems naturally exhibit error correction and robustness properties that make them more difficult to break than their fast thinking counterparts.\n LOCATION:https://researchseminars.org/talk/MPML/92/ END:VEVENT BEGIN:VEVENT SUMMARY:Yang-Hui He (London Institute for Mathematical Sciences & Merton College\, Oxford University) DTSTART;VALUE=DATE-TIME:20230202T170000Z DTEND;VALUE=DATE-TIME:20230202T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/93 DESCRIPTION:Title: COLLOQUIUM: Universes as Bigdata: Physics\, Geometry and Machine-Learning\nby Yang-Hui He (London Institute for Mathematical Sciences & Merton College\, Oxford University) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe search for the Theory of Everything has led to superstring theory\, which then led physics\, first to algebraic/differential geometry/topology\, and then to computational geometry\, and now to data science. With a concrete playground of the geometric landscape\, accumulated by the collaboration of physicists\, mathematicians and computer scientists over the last 4 decades\, we show how the latest techniques in machine-learning can help explore problems of interest to theoretical physics and to pure mathematics. At the core of our programme is the question: how can AI help us with mathematics?\n LOCATION:https://researchseminars.org/talk/MPML/93/ END:VEVENT BEGIN:VEVENT SUMMARY:Sebastian Engelke (University of Geneva) DTSTART;VALUE=DATE-TIME:20230112T170000Z DTEND;VALUE=DATE-TIME:20230112T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/94 DESCRIPTION:Title: Machine learning beyond the data range: extreme quantile regression\nby Sebastian Engelke (University of Geneva) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nMachine learning methods perform well in prediction tasks within the range of the training data. 
When interest is in quantiles of the response that go beyond the observed records\, these methods typically break down. Extreme value theory provides the mathematical foundation for estimation of such extreme quantiles. A common approach is to approximate the exceedances over a high threshold by the generalized Pareto distribution. For conditional extreme quantiles\, one may model the parameters of this distribution as functions of the predictors. Up to now\, the existing methods are either not flexible enough or do not generalize well in higher dimensions. We develop new approaches for extreme quantile regression that estimate the parameters of the generalized Pareto distribution with tree-based methods and recurrent neural networks. Our estimators outperform classical machine learning methods and methods from extreme value theory in simulation studies. We illustrate how the recurrent neural network model can be used for effective forecasting of flood risk.\n LOCATION:https://researchseminars.org/talk/MPML/94/ END:VEVENT BEGIN:VEVENT SUMMARY:Alhussein Fawzi (DeepMind) DTSTART;VALUE=DATE-TIME:20230119T170000Z DTEND;VALUE=DATE-TIME:20230119T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/95 DESCRIPTION:Title: Discovering faster matrix multiplication algorithms with deep reinforcement learning\nby Alhussein Fawzi (DeepMind) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nImproving the efficiency of algorithms for fundamental computational tasks such as matrix multiplication can have widespread impact\, as it affects the overall speed of a large amount of computations. The automatic discovery of algorithms using machine learning offers the prospect of reaching beyond human intuition and outperforming the current best human-designed algorithms. 
In this tal k I'll present AlphaTensor\, our reinforcement learning agent based on Alp haZero for discovering efficient and provably correct algorithms for the m ultiplication of arbitrary matrices. AlphaTensor discovered algorithms tha t outperform the state-of-the-art complexity for many matrix sizes. Partic ularly relevant is the case of 4 × 4 matrices in a finite field\, where A lphaTensor's algorithm improves on Strassen's two-level algorithm for the first time since its discovery 50 years ago. I'll present our problem form ulation as a single-player game\, the key ingredients that enable tackling such difficult mathematical problems using reinforcement learning\, and t he flexibility of the AlphaTensor framework.\n LOCATION:https://researchseminars.org/talk/MPML/95/ END:VEVENT BEGIN:VEVENT SUMMARY:Sara A. Solla (Northwestern University) DTSTART;VALUE=DATE-TIME:20230302T170000Z DTEND;VALUE=DATE-TIME:20230302T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/96 DESCRIPTION:Title: Lo w Dimensional Manifolds for Neural Dynamics\nby Sara A. Solla (Northwe stern University) as part of Mathematics\, Physics and Machine Learning (I ST\, Lisbon)\n\n\nAbstract\nThe ability to simultaneously record the activ ity from tens to hundreds to thousands of neurons has allowed us to analyz e the computational role of population activity as opposed to single neuro n activity. Recent work on a variety of cortical areas suggests that neura l function may be built on the activation of population-wide activity patt erns\, the neural modes\, rather than on the independent modulation of ind ividual neural activity. These neural modes\, the dominant covariation pat terns within the neural population\, define a low dimensional neural manif old that captures most of the variance in the recorded neural activity. 
We refer to the time-dependent activation of the neural modes as their laten t dynamics and argue that latent cortical dynamics within the manifold are the fundamental and stable building blocks of neural population activity. \n LOCATION:https://researchseminars.org/talk/MPML/96/ END:VEVENT BEGIN:VEVENT SUMMARY:Andreas Döpp (Ludwig-Maximilians-Universität München) DTSTART;VALUE=DATE-TIME:20230601T160000Z DTEND;VALUE=DATE-TIME:20230601T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/97 DESCRIPTION:Title: Ma chine-learning strategies in laser-plasma physics\nby Andreas Döpp (L udwig-Maximilians-Universität München) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nThe field of laser-p lasma physics has experienced significant advancements in the past few dec ades\, owing to the increasing power and accessibility of high-power laser s. Initially\, research in this area was limited to single-shot experiment s with minimal exploration of parameters. However\, recent technological a dvancements have enabled the collection of a wealth of data through both e xperimental and simulation-based approaches.
\n\nIn this seminar talk\, I will present a range of machine learning techniques that we have developed for applications in laser-plasma physics [1]. The first part of my talk will focus on Bayesian optimization\, where I will showcase our latest findings on multi-objective and multi-fidelity optimization of laser-plasma accelerators and neural networks [2-4].
\n\nIn the second part of the talk\, I will discuss machine learning solutions for tackling complex inverse problems\, such as image deblurring or extracting 3D information from 2D sensors [5-6]. Specifically\, I will discuss various adaptations of established convolutional network architectures\, such as the U-Net\, as well as novel physics-informed retrieval methods like deep algorithm unrolling. These techniques have shown promising results in overcoming the challenges posed by these intricate inverse problems.
\n\nReferences:
\n\n[1] Data-driven Science and Machine Learning Methods in Laser-Plasma Physics\nhttps://arxiv.org/abs/2212.00026
\n[2] Expected hypervolume improvement for simultaneous multi-objective and multi-fidelity optimization\nhttps://arxiv.org/abs/2112.13901
\n[3] Multi-objective and multi-fidelity Bayesian optimization of laser-plasma acceleration\nhttps://arxiv.org/abs/2210.03484
\n[4] Pareto Optimization of a Laser Wakefield Accelerator\nhttps://arxiv.org/abs/2303.15825
\n[5] Measuring spatio-temporal couplings using modal spatio-spectral wavefront retrieval\nhttps://arxiv.org/abs/2303.01360
\n[6] Hyperspectral Compressive Wavefront Sensing\nhttps://arxiv.org/abs/2303.03555
\n LOCATION:https://researchseminars.org/talk/MPML/97/ END:VEVENT BEGIN:VEVENT SUMMARY:Ben Edelman (Harvard University) DTSTART;VALUE=DATE-TIME:20230209T170000Z DTEND;VALUE=DATE-TIME:20230209T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/98 DESCRIPTION:Title: St udies in feature learning through the lens of sparse boolean functions \nby Ben Edelman (Harvard University) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nHow do deep neural network s learn to construct useful features? Why do self-attention-based networks such as transformers perform so well on combinatorial tasks such as langu age learning? Why do some capabilities of networks emerge "discontinuously " as the computational resources used for training are scaled up? We will present perspectives on these questions through the lens of a particular c lass of simple synthetic tasks: learning sparse boolean functions. In part one\, we will show that the hypothesis class of one-layer transformers ca n learn these functions in a statistically efficient manner. This leads to a view of each layer of a transformer as creating new "variables" out of sparse combinations of the previous layer's outputs. In part two\, we will focus on the classic task of learning sparse parities\, which is statisti cally easy but computationally difficult. We will demonstrate that SGD on various neural networks (transformers\, MLPs\, etc.) successfully learns s parse parities\, with computational efficiency that is close to known lowe r bounds. Moreover\, the training curves display no apparent progress for a long time\, and then quickly drop late in training. 
We show that despite this apparent delayed breakthrough in performance\, hidden progress is ac tually being made throughout the course of training.\n\nBased on joint wor k with Surbhi Goel\, Sham Kakade\, Cyril Zhang\, Boaz Barak\, and Eran Mal ach:\n\nhttps://arxiv.org/abs/2110.10090\n\nhttps://arxiv.org/abs/2207.087 99\n LOCATION:https://researchseminars.org/talk/MPML/98/ END:VEVENT BEGIN:VEVENT SUMMARY:Valentin De Bortoli (Center for Sciences of Data\, ENS Ulm\, Paris ) DTSTART;VALUE=DATE-TIME:20230316T170000Z DTEND;VALUE=DATE-TIME:20230316T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/99 DESCRIPTION:Title: Di ffusion models\, theory and methodology\nby Valentin De Bortoli (Cente r for Sciences of Data\, ENS Ulm\, Paris) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nGenerative modeling is the task of drawing new samples from an underlying distribution known onl y via an empirical measure. There exists a myriad of models to tackle this problem with applications in image and speech processing\, medical imagin g\, forecasting and protein modeling to cite a few. Among these methods di ffusion models are a new powerful class of generative models that exhibit remarkable empirical performance. They consist of a ``noising'' stage\, wh ereby a diffusion is used to gradually add Gaussian noise to data\, and a generative model\, which entails a ``denoising'' process defined by approx imating the time-reversal of the diffusion. In this talk we discuss three aspects of diffusion models. First\, we will dive into the methodology beh ind diffusion models. Second\, we will present some of their theoretical g uarantees with an emphasis on their behavior under the so-called manifold hypothesis. Such theoretical guarantees are non-vacuous and provide insigh t on the empirical behavior of these models. 
Finally\, I will present an e xtension of diffusion models to the Optimal Transport setting and introduc e Diffusion Schrodinger Bridges.\n LOCATION:https://researchseminars.org/talk/MPML/99/ END:VEVENT BEGIN:VEVENT SUMMARY:Memming Park (Champalimaud Foundation) DTSTART;VALUE=DATE-TIME:20230323T170000Z DTEND;VALUE=DATE-TIME:20230323T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/100 DESCRIPTION:Title: O n learning signals in recurrent networks\nby Memming Park (Champalimau d Foundation) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nNeural dynamical systems with stable attractor str uctures such as point attractors and continuous attractors are widely hypo thesized to underlie meaningful temporal behavior that requires working me mory. However\, perhaps counterintuitively\, having good working memory is not sufficient for supporting useful learning signals that are necessary to adapt to changes in the temporal structure of the environment. We show that in addition to the well-known continuous attractors\, the periodic an d quasi-periodic attractors are also fundamentally capable of supporting l earning arbitrarily long temporal relationships. Due to the fine tuning pr oblem of the continuous attractors and the lack of\ntemporal fluctuations\ , we believe the less explored quasi-periodic attractors are uniquely qual ified for learning to produce temporally structured behavior. Our theory h as wide implications for the design of artificial learning systems\, and m akes predictions on the observable signatures of biological neural dynamic s that can support temporal dependence learning. Based on our theory\, we developed a new initialization scheme for artificial recurrent neural netw orks which outperforms standard methods for tasks that require learning te mporal dynamics. 
Finally\, we speculate on their biological implementation s and make predictions on neuronal dynamics.\n LOCATION:https://researchseminars.org/talk/MPML/100/ END:VEVENT BEGIN:VEVENT SUMMARY:Rongjie Lai (Rensselaer Polytechnic Institute) DTSTART;VALUE=DATE-TIME:20230420T160000Z DTEND;VALUE=DATE-TIME:20230420T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/101 DESCRIPTION:Title: L earning Manifold-Structured Data using Deep Neural Networks: Theory and Ap plications\nby Rongjie Lai (Rensselaer Polytechnic Institute) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract \nDeep artificial neural networks have made great success in many problems in science and engineering. In this talk\, I will discuss our recent effo rts to develop DNNs capable of learning non-trivial geometry information h idden in data. In the first part\, I will discuss our work on advocating t he use of a multi-chart latent space for better data representation. Inspi red by differential geometry\, we propose a Chart Auto-Encoder (CAE) and p rove a universal approximation theorem on its representation capability. C AE admits desirable manifold properties that conventional auto-encoders wi th a flat latent space fail to obey. We further establish statistical guar antees on the generalization error for trained CAE models and show their r obustness to noise. Our numerical experiments also demonstrate satisfactor y performance on data with complicated geometry and topology. If time perm its\, I will discuss our work on defining convolution on manifolds via par allel transport. This geometric way of defining parallel transport convolu tion (PTC) provides a natural combination of modeling and learning on mani folds. PTC allows for the construction of compactly supported filters and is also robust to manifold deformations. I will demonstrate its applicatio ns to shape analysis and point clouds processing using PTC-nets. 
This talk is based on a series of joint work with my students and collaborators.\n LOCATION:https://researchseminars.org/talk/MPML/101/ END:VEVENT BEGIN:VEVENT SUMMARY:Gonçalo Correia (IST and Priberam Labs) DTSTART;VALUE=DATE-TIME:20230309T170000Z DTEND;VALUE=DATE-TIME:20230309T180000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/102 DESCRIPTION:Title: L earnable Sparsity and Weak Supervision for Data-Efficient\, Transparent\, and Compact Neural Models\nby Gonçalo Correia (IST and Priberam Labs) as part of Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\ nAbstract\nNeural network models have become ubiquitous in Machine Learnin g literature. These models are compositions of differentiable building blo cks that result in dense representations of the underlying data. To obtain good representations\, conventional neural models require many training d ata points. Moreover\, those representations\, albeit capable of obtaining a high performance on many tasks\, are largely uninterpretable. These mod els are often overparameterized and give out representations that do not c ompactly represent the data. To address these issues\, we find solutions i n sparsity and various forms of weak supervision. For data-efficiency\, we leverage transfer learning as a form of weak supervision. The proposed mo del can perform similarly to models trained on millions of data points on a sequence-to-sequence generation task\, even though we only train it on a few thousand. For transparency\, we propose a probability normalizing fun ction that can learn its sparsity. The model learns the sparsity it needs differentiably and thus adapts it to the data according to the neural comp onent's role in the overall structure. We show that the proposed model imp roves the interpretability of a popular neural machine translation archite cture when compared to conventional probability normalizing functions. 
Fin ally\, for compactness\, we uncover a way to obtain exact gradients of dis crete and structured latent variable models efficiently. The discrete node s in these models can compactly represent implicit clusters and structures in the data\, but training them was often complex and prone to failure si nce it required approximations that rely on sampling or relaxations. We pr opose to train these models with exact gradients by parameterizing discret e distributions with sparse functions\, both unstructured and structured. We obtain good performance on three latent variable model applications whi le still achieving the practicality of the approximations mentioned above. Through these novel contributions\, we challenge the conventional wisdom that neural models cannot exhibit data-efficiency\, transparency\, or comp actness.\n LOCATION:https://researchseminars.org/talk/MPML/102/ END:VEVENT BEGIN:VEVENT SUMMARY:Diogo Gomes (KAUST) DTSTART;VALUE=DATE-TIME:20230504T160000Z DTEND;VALUE=DATE-TIME:20230504T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/103 DESCRIPTION:Title: M athematics for data science and AI - curriculum design\, experiences\, and lessons learned\nby Diogo Gomes (KAUST) as part of Mathematics\, Phys ics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nIn this talk\, we w ill explore the importance of mathematical foundations for AI and data sci ence and the design of an academic curriculum for graduate students. While traditional mathematics for AI and data science has focused on core techn iques like linear algebra\, basic probability\, and optimization methods ( e.g.\, gradient and stochastic gradient descent)\, several advanced mathem atical techniques are now essential to understanding modern data science. These include ideas from the calculus of variations in spaces of random va riables\, functional analytic methods\, ergodic theory\, control theory me thods in reinforcement learning\, and metrics in spaces of probability mea sures. 
We will discuss the author's experience designing an applied mathem atics curriculum on data science and draw on the author's experience and l essons learned in teaching an advanced course on the mathematical foundati ons of data science. This talk aims to promote discussion and exchange of ideas on how mathematicians can play an important role in AI and data scie nce and better equip our students to excel in this field.\n LOCATION:https://researchseminars.org/talk/MPML/103/ END:VEVENT BEGIN:VEVENT SUMMARY:Harry Desmond (University of Portsmouth) DTSTART;VALUE=DATE-TIME:20230511T160000Z DTEND;VALUE=DATE-TIME:20230511T170000Z DTSTAMP;VALUE=DATE-TIME:20241112T134206Z UID:MPML/104 DESCRIPTION:Title: E xhaustive Symbolic Regression (or how to find the best function for your d ata)\nby Harry Desmond (University of Portsmouth) as part of Mathemati cs\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\nSymbolic r egression aims to find optimal functional representation of datasets\, wit h broad applications across science. This is traditionally done using a "g enetic algorithm" which stochastically searches function space using an ev olution-inspired method for generating new trial functions. Motivated by t he uncertainties inherent in this approach -- and its failure on seemingly simple test cases -- I will describe a new method which exhaustively sear ches and evaluates function space. Coupled to a model selection principle based on minimum description length\, Exhaustive Symbolic Regression is gu aranteed to find the simple equations that optimally balance simplicity wi th accuracy on any dataset. I will describe how the method works and showc ase it on Hubble rate measurements and dynamical galaxy data.\n\nBased on work with Deaglan Bartlett and Pedro G. Ferreira:Causal
representation learning (CRL) aims at learning causal factors and their causal relations from high-dimensional observations\, e.g. images. In general\, this is an ill-posed problem\, but under certain assumptions or with the help of additional information or interventions\, we are able to guarantee that the representations we learn correspond to some true underlying causal factors up to some equivalence class.
\nIn this talk I will first present CITRIS (https://proceedings.mlr.press/v162/lippe22a/lippe22a.pdf)\, a variational autoencoder framework for causal representation learning from temporal sequences of images\, in systems in which we can perform interventions. CITRIS exploits temporality and observed intervention targets to identify scalar and multidimensional causal factors\, such as 3D rotation angles. In experiments on 3D rendered image sequences\, CITRIS outperforms previous methods on recovering the underlying causal variables. Moreover\, using pretrained autoencoders\, CITRIS can even generalize to unseen instantiations of causal factors.
\n\nWhile CRL is an exciting and promising new field of research\, the assumptions required by CITRIS and other current CRL methods can be difficult to satisfy in many settings. Moreover\, in many practical cases learning representations that are not guaranteed to be fully causal\, but exploit some ideas from causality\, can still be extremely useful. As examples\, I will describe some of our work on exploiting these "causality-inspired" representations for adapting policies across domains in RL (https://openreview.net/forum?id=8H5bpVwvt5) and to nonstationary environments (https://openreview.net/forum?id=VQ9fogN1q6e)\, and how learning factored graphical representations (even if not necessarily causal) can be beneficial in these (and possibly other) settings.