Equidistribution-based training of univariate free knot splines, PINNS, and ReLU neural networks
Chris Budd OBE (University of Bath)
| Fri Jul 10, 22:30-23:30 | |
| Lecture held in K9509. |
Abstract: We consider the problem of improving the accuracy, convergence, and conditioning of univariate nonlinear function approximations using (mainly) shallow neural networks (NN) with a rectified linear unit (ReLU) activation function. The standard L_2-based approximation problem is ill-conditioned, and the behaviour of the optimisation algorithms used in training these networks degrades rapidly as the width of the network increases. This can lead to significantly poorer approximation in practice than we would expect from the theoretical expressivity of the ReLU NN architecture. Univariate shallow ReLU NNs and traditional approximation methods, such as univariate Free Knot Splines (FKS), span the same function space and thus have the same theoretical expressivity.
However, the FKS representation both remains well-conditioned as the number of knots increases and can be highly accurate if the knots are correctly placed. We leverage the theory of optimal piecewise linear interpolants to improve the training procedure for both an FKS and a ReLU NN. For the FKS we propose a novel two-level training procedure: first, we solve the nonlinear problem of finding the optimal knot locations of the interpolating FKS using an equidistribution approach; then we solve the nearly linear, well-conditioned problem of finding the optimal weights and knots of the FKS.
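As a rough illustration of the equidistribution idea (not the speaker's algorithm), the sketch below places knots so that each interval carries an equal share of a monitor function. It assumes the classical monitor (eps + u''(x)^2)^(1/5) for the L_2 error of piecewise linear interpolation; the function names, the regularisation eps, and the tanh test target are all hypothetical choices made for this example.

```python
import numpy as np

def equidistributed_knots(d2u, a, b, n_knots, n_quad=20001, eps=1e-8):
    """Place n_knots on [a, b] so each interval has equal monitor mass.

    Assumed monitor: m(x) = (eps + u''(x)^2)^(1/5); eps keeps m positive.
    """
    x = np.linspace(a, b, n_quad)
    m = (eps + d2u(x) ** 2) ** 0.2
    # Cumulative integral of m by the trapezoidal rule, starting at 0.
    M = np.concatenate(([0.0], np.cumsum(0.5 * (m[1:] + m[:-1]) * np.diff(x))))
    # Invert the increasing map x -> M(x) at equally spaced mass levels.
    targets = np.linspace(0.0, M[-1], n_knots)
    return np.interp(targets, M, x)

def interp_l2_error(u, knots, n_eval=20001):
    """Approximate L_2 error of the piecewise linear interpolant of u."""
    x = np.linspace(knots[0], knots[-1], n_eval)
    err = u(x) - np.interp(x, knots, u(knots))
    return np.sqrt(np.mean(err ** 2) * (knots[-1] - knots[0]))

# Hypothetical rapidly varying target with an interior layer at x = 0.5.
k = 50.0
def u(x):
    return np.tanh(k * (x - 0.5))

def d2u(x):
    t = np.tanh(k * (x - 0.5))
    return -2.0 * k ** 2 * t * (1.0 - t ** 2)

if __name__ == "__main__":
    n = 33
    print("uniform knots       :", interp_l2_error(u, np.linspace(0.0, 1.0, n)))
    print("equidistributed knots:", interp_l2_error(u, equidistributed_knots(d2u, 0.0, 1.0, n)))
```

In this sketch the knots only interpolate the target; the second level described above (optimising weights and knots together) is not shown.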
The training of the FKS gives insights into how we can train a ReLU NN effectively to give an equally accurate approximation. To do this we combine the training of the ReLU NN with an equidistribution-based loss to find the breakpoints of the ReLU functions. This is then combined with preconditioning the ReLU NN approximation (to take an FKS form) to find the scalings of the ReLU functions. This procedure leads to a fast, well-conditioned and reliable method of finding an accurate shallow ReLU NN approximation to a univariate target function. This method avoids spectral bias and is highly effective for a wide variety of functions. We test this method on a series of regular, singular, and rapidly varying target functions and obtain good results, realising the expressivity of the shallow ReLU network in all cases. We conclude that in the shallow case, to gain full expressivity for the ReLU NN, we must both find the optimal breakpoints (by equidistribution) and precondition the problem of finding the optimal coefficients. We then extend our results to more general activation functions, and to deeper networks.
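To make the link between the two representations concrete, here is a minimal sketch, assuming fixed breakpoints, of the change of basis between a hat-function (FKS) form and a shallow ReLU form, together with a comparison of the conditioning of the two least-squares design matrices. The helper names and the uniform breakpoints are hypothetical, and this is not claimed to be the preconditioner used in the talk.

```python
import numpy as np

def relu_basis(x, knots):
    """Design matrix with columns 1, ReLU(x - k_0), ..., ReLU(x - k_{n-2})."""
    cols = [np.ones_like(x)] + [np.maximum(x - kj, 0.0) for kj in knots[:-1]]
    return np.stack(cols, axis=1)

def hat_basis(x, knots):
    """Design matrix of hat functions: column j interpolates the unit vector e_j."""
    eye = np.eye(len(knots))
    return np.stack([np.interp(x, knots, eye[j]) for j in range(len(knots))], axis=1)

def fks_to_relu(values, knots):
    """Convert nodal values of a linear spline into ReLU coefficients.

    The bias is the value at the first knot, the first ReLU weight is the
    first slope, and every later weight is the jump in slope at its knot.
    """
    slopes = np.diff(values) / np.diff(knots)
    return np.concatenate(([values[0]], [slopes[0]], np.diff(slopes)))

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 2001)
    for n in (9, 17, 33, 65):
        knots = np.linspace(0.0, 1.0, n)
        print(n,
              "cond(ReLU basis) =", np.linalg.cond(relu_basis(x, knots)),
              "cond(hat basis) =", np.linalg.cond(hat_basis(x, knots)))
```

Both matrices span the same piecewise linear functions on the knot interval, so the difference in the printed condition numbers comes purely from the choice of basis.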
We then apply this methodology to the PINNS and DRM machine learning methods for solving differential equations, showing that this leads to more accurate and stable schemes.
numerical analysis, optimization and control
Audience: researchers in the topic
( paper )
SFU Mathematics of Computation, Application and Data ("MOCAD") Seminar
| Organizers: | Weiran Sun*, Nilima Nigam |
| *contact for this listing |
