Automatic differentiation of programs with discrete randomness

Moritz Schauer (Chalmers University of Technology & University of Gothenburg)

20-Oct-2022, 13:15-14:00 (18 months ago)

Abstract: Automatic differentiation (AD), a technique for constructing new programs which compute the derivative of an original program, has become ubiquitous throughout scientific computing and deep learning due to the improved performance afforded by gradient-based optimization. However, AD systems have been restricted to the subset of programs that have a continuous dependence on parameters. Programs that have discrete stochastic behaviors governed by distribution parameters, such as flipping a coin with probability p of being heads, pose a challenge to these systems because the connection between the result (heads vs tails) and the parameters (p) is fundamentally discrete. In this paper we develop a new reparameterization-based methodology that allows for generating programs whose expectation is the derivative of the expectation of the original program. We showcase how this method gives an unbiased and low-variance estimator which is as automated as traditional AD mechanisms. We demonstrate unbiased forward-mode AD of discrete-time Markov chains, agent-based models such as Conway's Game of Life, and unbiased reverse-mode AD of a particle filter. Our code is available at github.com/gaurav-arya/StochasticAD.jl

machine learningprobabilitystatistics theory

Audience: researchers in the topic

( paper )


Gothenburg statistics seminar

Series comments: Gothenburg statistics seminar is open to the interested public, everybody is welcome. It usually takes place in MVL14 (http://maps.chalmers.se/#05137ad7-4d34-45e2-9d14-7f970517e2b60, see specific talk).

Organizers: Moritz Schauer*, Ottmar Cronie*
*contact for this listing

Export talk to