Traveling wave solutions of partial differential equations via neural networks (Q1983171)

    15 September 2021
Review Report: Travelling wave solutions of coupled PDEs arise in chemical physics, mathematical biology and a host of other applied sciences. Over the past three decades the problem has attracted much interest and has become an important area of research, so it is of great significance to investigate travelling wave solutions and the associated speeds, thereby gaining insight into natural and bio-chemical phenomena. In the present paper, the authors propose a novel method for approximating travelling wave solutions via deep neural networks and apply it to three selected problems of current interest: the Keller-Segel model of chemotaxis, the Allen-Cahn model of chemical kinetics and the Lotka-Volterra model of population regulation, all modelled by coupled nonlinear PDEs. The motivation for analysing such equations arises from the methodological treatments by other researchers exploring the dynamics and applications of these models. According to the authors, each of these models is known to have a unique solution, and the solutions are widely studied. The motivation for analysing them using artificial neural networks (ANNs) stems from the related previous work [\textit{H. J. Hwang} et al., J. Comput. Phys. 419, Article ID 109665, 25 p. (2020; Zbl 07507228); \textit{H. Jo} et al., Netw. Heterog. Media 15, No. 2, 247--259 (2020; Zbl 1442.35474); \textit{J. Sirignano} and \textit{K. Spiliopoulos}, J. Comput. Phys. 375, 1339--1364 (2018; Zbl 1416.65394)] and from the universal approximation theorem (UAT) for neural networks.

Content and structure: The material in this paper is structured as follows. Section 2 introduces an abstract model that admits travelling wave solutions in a general context and gives a detailed description of the proposed methodology for approximating the travelling wave solutions and the corresponding speed with a neural network model. Loss functions based on the \(L^2\) error of the governing equations are defined. Since the boundary conditions are difficult to impose directly, corresponding loss terms are defined appropriately. Similarly, to estimate the wave speed more accurately, Neumann boundary conditions, which the solution satisfies asymptotically, are imposed at the end points of the truncated interval, and a loss based on the mean of the limiting values is defined to overcome the effect of translation. A training procedure consisting of two parts, feed-forward and back-propagation, is briefly explained. The optimization process reduces the total loss formed by combining all the losses defined above; a schematic diagram of the overall architecture is presented. The optimization problem can be solved by the gradient descent (GD) algorithm; partial derivatives of the loss functions are computed by automatic differentiation (AD) [\textit{A. Paszke} et al., Automatic differentiation in PyTorch. NeurIPS, Autodiff Workshop (2017)], and the ADAM optimizer is employed [\textit{D. P. Kingma} and \textit{J. Ba}, ``Adam: a method for stochastic optimization'', Preprint, \url{arXiv:1412.6980}].

Section 3 discusses the application of the deep neural network to the KS model for the approximation of travelling wave solutions and the corresponding speed. The authors first deal with the classical KS model; by imposing the travelling wave ansatz, a coupled ODE with boundary conditions is obtained.
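The reduction just mentioned can be made explicit for a generic scalar evolution equation (a standard illustration of the substitution; the authors' KS system is a coupled analogue of this scalar case). For \(u_t = \mathcal{N}(u, u_x, u_{xx})\), the travelling wave ansatz
\[
u(x,t) = U(z), \qquad z = x - ct,
\]
turns the PDE into the profile ODE
\[
-c\,U'(z) = \mathcal{N}\bigl(U(z), U'(z), U''(z)\bigr), \qquad \lim_{z \to \pm\infty} U(z) = U_{\pm},
\]
in which both the profile \(U\) and the speed \(c\) are unknowns; it is this pair that the neural network method approximates simultaneously.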
For the existence and uniqueness of the solution, a proposition from [\textit{T. Li} and \textit{Z.-A. Wang}, Math. Biosci. 240, No. 2, 161--168 (2012; Zbl 1316.92013)] is invoked, which gives an explicit expression for the wave speed. To characterize the set of functions that the neural network can approximate, the authors refer to a definition and a theorem from [\textit{X. Li}, Neurocomputing 12, No. 4, 327--343 (1996; Zbl 0861.41013)]. As part of the theoretical background, two theorems are stated and proved, ensuring that the value of the loss function converges to zero and that the estimated speed converges to the correct value. Numerical experiments for the KS model are conducted with a parameter chosen sufficiently small to guarantee the existence and uniqueness of solutions, and with varying model parameters. With a few modifications, the extension of the classical model on the domain \(\mathbb{R}\) to the multi-dimensional model on \(\mathbb{R}^n\) is demonstrated for \(n=4\). Overall, according to the authors, the proposed method can be used to approximate travelling wave solutions in higher dimensions.

Section 4 applies the proposed method to the Allen-Cahn model with relaxation. As in the previous section, the original domain, the real line, is truncated to the interval \([-a, a]\), and learning is done within it. Numerical results are obtained for \(a=200\) and varying model parameters. Experiments are conducted to estimate the width of the interval needed to obtain a reasonably good approximation of the solution.

Section 5 applies the proposed method to the LV competition model with two species, for which the existence and uniqueness of the solution is established in [\textit{Y. Kan-on}, SIAM J. Math. Anal. 26, No. 2, 340--363 (1995; Zbl 0821.34048)]. To the best of the authors' knowledge, the only known fact about the speed in this model is its sign. The first experiment approximates a standing wave (wave front), the only case in which the exact speed is known.

Finally, Section 6 concludes the paper and indicates directions for further research. The difficulty arising from the unboundedness of the domain is addressed by truncating the real line; moreover, to improve the accuracy of the approximate solution, the addition of Neumann boundary conditions at the end points of the truncated interval is justified. However, some unresolved issues remain and form part of the authors' future work.
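Before turning to detailed observations, the training strategy summarized above admits a compact illustration. The following is a minimal PyTorch sketch of the loss construction for a generic scalar profile equation \(-c\,U' = U'' + f(U)\) on a truncated interval \([-a, a]\); the network architecture, the nonlinearity \(f\), the pinning term used to remove translation invariance, and all hyper-parameters are illustrative assumptions, not the authors' exact configuration.

\begin{verbatim}
# Minimal sketch (assumed setup, not the authors' code): approximate a wave
# profile U(z) and its speed c by minimizing PDE-residual and boundary losses.
import torch

torch.manual_seed(0)
a = 50.0                                    # half-width of the truncated interval
U_minus, U_plus = 1.0, 0.0                  # assumed asymptotic states

net = torch.nn.Sequential(                  # network approximating U(z)
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
c = torch.nn.Parameter(torch.tensor(0.5))   # wave speed, learned jointly

def f(U):                                   # illustrative bistable nonlinearity
    return U * (1.0 - U) * (U - 0.25)

optimizer = torch.optim.Adam(list(net.parameters()) + [c], lr=1e-3)

for _ in range(5000):
    optimizer.zero_grad()

    # Interior collocation points; autograd supplies U' and U''.
    z = (2 * a * torch.rand(256, 1) - a).requires_grad_(True)
    U = net(z)
    Uz = torch.autograd.grad(U, z, torch.ones_like(U), create_graph=True)[0]
    Uzz = torch.autograd.grad(Uz, z, torch.ones_like(Uz), create_graph=True)[0]
    loss_pde = ((Uzz + c * Uz + f(U)) ** 2).mean()   # L^2 residual of the ODE

    # Asymptotic limits imposed as penalties at the truncated end points.
    zb = torch.tensor([[-a], [a]], requires_grad=True)
    Ub = net(zb)
    loss_bc = ((Ub[0] - U_minus) ** 2 + (Ub[1] - U_plus) ** 2).sum()

    # Neumann condition U'(+-a) ~ 0, added to sharpen the speed estimate.
    Ubz = torch.autograd.grad(Ub, zb, torch.ones_like(Ub), create_graph=True)[0]
    loss_neumann = (Ubz ** 2).mean()

    # Pin U(0) to the mean of the limits to remove translation invariance
    # (one plausible reading of the authors' construction).
    loss_pin = ((net(torch.zeros(1, 1)) - 0.5 * (U_minus + U_plus)) ** 2).mean()

    loss = loss_pde + loss_bc + loss_neumann + loss_pin
    loss.backward()
    optimizer.step()
\end{verbatim}

Declaring the speed \(c\) as a trainable parameter and optimizing it jointly with the network weights mirrors the simultaneous approximation of profile and speed described in the methodology.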
Observations and comments: 1) (O) In the present paper, the authors apply ANNs to physical, chemical and biological phenomena modelled by coupled nonlinear PDEs admitting travelling wave solutions. (C) ANNs provide an ideal representation tool for PDE solutions because they are characterized by adjustable parameters that can be modified by incremental training algorithms. 2) (O) The authors show that deep neural networks have powerful function-fitting and approximation capabilities and great potential in the study of partial differential equations. (C) ANN solutions of PDEs have further advantages over FDM and FEM solutions that are especially important in non-stationary environments. 3) (O) The paper provides a natural paradigm for solving PDEs via ANNs, because the ANN can be trained to minimize loss functions defined appropriately from the governing equations and boundary conditions, as in the sketch above. (C) A growing number of researchers now use deep learning methods to study partial differential equations. 4) (O) The authors employ ADAM, the most popular optimization algorithm based on gradient descent. (C) The optimization algorithm determines how the parameters of the neural network are adjusted. 5) (O) Plots of the estimated wave speed and trajectories of the total loss over the training epochs are provided for different parameters of the A-C and L-V models. (C) The accuracy of an ANN model depends on a large number of choices, such as weights, biases, the number of hidden layers, the kinds of activation functions and the hyper-parameters; the number of epochs is a hyper-parameter that plays an integral part in the training process. 6) (O) The sigmoid (logistic) and \(\tanh\) (hyperbolic tangent) functions are used as activation functions. (C) The primary role of an activation function in an ANN is to transform the summed weighted input of a node into an output value fed to the next hidden layer or to the output; its main purpose is to add non-linearity to the network, and nonlinear activation functions are the most widely used. Both functions are differentiable and monotonic, with ranges \((0,1)\) and \((-1,1)\), respectively. 7) (O) The \(L^2\)-norm loss, also known as the least squares error (LSE), is used to define the total loss function for the selected models. (C) Neural networks are trained by an optimization process that requires a loss function to quantify the model error; the loss function measures the quality of the network's output. 8) (O) LeCun initialization and Xavier uniform initialization are used in the numerical experiments (see the sketch following the list of main contributions below). (C) The initialization of biases and weights can be critical to the model's ultimate performance, and it depends on the choice of activation function; Xavier initialization works well with \(\tanh\) activations. Both schemes scale the variance of the weights according to the respective numbers of inputs and outputs. 9) (O) The validation and verification of the proposed neural network model, including the selection of appropriate error metrics, relies on theoretical results derived and proved in the present and related works. (C) Neural network models are data-driven and therefore resist purely analytical or theoretical validation; they must be validated empirically. 10) (O) The ANN model developed by the authors approximates travelling wave solutions and the corresponding speeds with good precision and fast convergence. (C) A growing number of researchers now use deep learning methods to study PDEs, systems of PDEs and coupled nonlinear PDEs.

Main contributions: 1) introduction of an additional loss function to handle infinite domains; 2) simultaneous approximation of the travelling wave solutions and the wave speed of given PDEs; 3) theoretical guarantees for the estimated wave speed; 4) provision of a unique solution as a correct answer even in cases where uniqueness is not guaranteed; 5) overcoming the curse of dimensionality.
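As a complement to items 6) and 8) above, here is a minimal sketch of Xavier (Glorot) uniform initialization paired with \(\tanh\) activations; the layer sizes are arbitrary placeholders, not the authors' architecture.

\begin{verbatim}
# Minimal sketch (assumed architecture): Xavier uniform initialization,
# which is well matched to tanh activations.
import torch

def init_xavier(module):
    # Scale each weight matrix by its fan-in and fan-out; zero the biases.
    if isinstance(module, torch.nn.Linear):
        torch.nn.init.xavier_uniform_(module.weight)
        torch.nn.init.zeros_(module.bias)

net = torch.nn.Sequential(                  # placeholder tanh network
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
net.apply(init_xavier)
\end{verbatim}

Keeping the weight variance matched to the fan-in and fan-out prevents the \(\tanh\) units from saturating at the start of training, which is why this pairing is the usual default.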
Concluding remarks: Using the method presented in this paper, the authors perform an experimental analysis validating the theoretical results for three important partial differential equations. The method achieves good experimental results thanks to the powerful function approximation ability of neural networks and the physical, chemical and biological information contained in the NCPDEs (nonlinear coupled partial differential equations). Although the method has many advantages, such as not requiring a discretization of the PDEs and eliminating the need for interpolation to cover the entire domain of the problem, it also faces problems: solving PDEs with neural networks relies heavily on training data and often requires more training time when the quality of the training data is poor. It is therefore also important to investigate how to construct high-quality training data sets so as to reduce the training time. In this paper, the authors focus primarily on one-dimensional coupled nonlinear partial differential equations studied with deep ANNs; to the best of the authors' knowledge and belief, the method can be extended to multi-dimensional problems with a few modifications.
    traveling wave solution
    estimation of wave speed
    neural networks
    convergence