Template Numerical Library version develop:1f26cbe9a
ODE solvers tutorial

Introduction

In this part, we describe solvers of ordinary differential equations having the following form:

\[ \frac{d \vec u}{dt} = \vec f( t, \vec u) \text{ on } (0,T), \]

\[ \vec u( 0 ) = \vec u_{ini}. \]

TNL offers the following ODE solvers:

  1. TNL::Solvers::ODE::Euler - the Euler method with the 1-st order of accuracy.
  2. TNL::Solvers::ODE::Merson - the Runge-Kutta-Merson solver with the 4-th order of accuracy and adaptive choice of the time step.

Each solver has its static counterpart which can be run even in the GPU kernels which means that it can be combined with TNL::Algorithms::ParallelFor for example. The static ODE solvers are the following:

  1. TNL::Solvers::ODE::StaticEuler - the Euler method with the 1-st order of accuracy.
  2. TNL::Solvers::ODE::StaticMerson - the Runge-Kutta-Merson solver with the 4-th order of accuracy and adaptive choice of the time step.

Static ODE solvers

Static solvers are supposed to be used mainly when \( x \in R \) is scalar or \( \vec x \in R^n \) is vector where \( n \) is small. Firstly, we show example of scalar problem of the following form:

\[ \frac{d u}{dt} = t \sin ( t )\ \rm{ on }\ (0,T), \]

\[ u( 0 ) = 0. \]

This problem can be solved as follows:

1#include <iostream>
2#include <TNL/Solvers/ODE/StaticEuler.h>
3
4using Real = double;
5
6int main( int argc, char* argv[] )
7{
9 const Real final_t = 10.0;
10 const Real tau = 0.001;
11 const Real output_time_step = 0.25;
12
13 ODESolver solver;
14 solver.setTau( tau );
15 solver.setTime( 0.0 );
16 Real u = 0.0;
17 while( solver.getTime() < final_t )
18 {
19 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
20 auto f = [] ( const Real& t, const Real& tau, const Real& u, Real& fu ) {
21 fu = t * sin( t );
22 };
23 solver.solve( u, f );
24 std::cout << solver.getTime() << " " << u << std::endl;
25 }
26}
Solver of ODEs with the first order of accuracy.
Definition: StaticEuler.h:56
__cuda_callable__ void setTau(const RealType &tau)
Setter of the time step used for the computation.
Definition: StaticExplicitSolver.hpp:52
T endl(T... args)
constexpr ResultType min(const T1 &a, const T2 &b)
This function returns minimum of two numbers.
Definition: Math.h:30
T sin(T... args)

We first define the type Real representing the floating-point arithmetics, which is double in this case (line 4). In the main function, we define the type of the ODE solver ( ODESolver, line 8). We choose TNL::Solvers::ODE::StaticEuler. We define the variable final_t (line 9) representing the size of the time interval \( (0,T)\), next we define the integration time step tau (line 10) and the variable output_time_step (line 11) representing checkpoints in which we will print value of the solution \( x(t)\). On the line 13, we create an instance of the ODESolver and set the integration time step (line 14) and the initial time of the solver (line 15). Next we create variable u representing the solution of the given ODE and we initiate it with the initial condition \( u(0) = 0\) (line 16). On the lines 17-25, we iterate over the interval \( (0,T) \) with step given by the frequency of the checkpoints given by the variable output_time_steps. On the line 19, we let the solver to iterate until the next checkpoint or the end of the interval \((0,T) \) depending on what occurs first (it is expressed by TNL::min( solver.getTime() + output_time_step, final_t )). On the lines 20-22, we define the lambda function f representing the right-hand side of the ODE being solved. The lambda function receives the following arguments:

  • t is the current value of the time variable \( t \in (0,T)\),
  • tau is the current integration time step,
  • u is the current value of the solution \( u(t)\),
  • fu is a reference on a variable into which we evaluate the right-hand side \( f(u,t) \) on the ODE.

The lambda function is supposed to compute just the value of fu. It is t * sin(t) on our case. Finally we call the ODE solver (line 23). As parameters, we pass the variable u representing the solution \( u(t)\) and a lambda function representing the right-hand side of the ODE. On the line 23, we print values of the solution at given checkpoints. The result looks as follows:

0.001 0
0.002 1e-09
0.252 0.00526916
0.502 0.0409948
0.752 0.13364
1.002 0.302432
1.252 0.556612
1.502 0.893635
1.752 1.2985
2.002 1.74432
2.252 2.19409
2.502 2.60357
2.752 2.92506
3.002 3.11173
3.252 3.1222
3.502 2.92497
3.752 2.50232
4.002 1.85323
4.252 0.995174
4.502 -0.0355494
4.752 -1.18502
5.002 -2.38443
5.252 -3.55415
5.502 -4.60904
5.752 -5.46451
6.002 -6.04295
6.252 -6.28004
6.502 -6.1306
6.752 -5.57319
7.002 -4.61342
7.252 -3.28541
7.502 -1.65121
7.752 0.201757
8.002 2.16523
8.252 4.11644
8.502 5.92576
8.752 7.46532
9.002 8.61785
9.252 9.28537
9.502 9.3969
9.752 8.9147
10 7.84941

Such data can by visualized using Gnuplot as follows

plot 'StaticODESolver-SineExample.out' with lines

or it can be processed by the following Python script which draws graph of the function \( u(t) \) using Matplotlib.

1#!/usr/bin/env python3
2
3import sys
4import matplotlib.pyplot as plt
5import numpy as np
6
7
9plt.rcParams['text.usetex'] = True
10
11
13f = open( sys.argv[1], 'r' )
14x_lst = []
15y_lst = []
16for line in f:
17 line = line.strip()
18 a = line.split()
19 x_lst.append( float( a[ 0 ] ) )
20 y_lst.append( float( a[ 1 ] ) )
21
22
24x = np.array(x_lst)
25y = np.array(y_lst)
26
27
29fig, ax = plt.subplots()
30ax.set_xlim( [0,10] )
31ax.set_ylim( [-10,10] )
32ax.plot(x, y, linewidth=2.0)
33ax.set_xlabel( "$t$" )
34ax.set_ylabel( "$u(t)$" )
35plt.savefig( sys.argv[2] )
36plt.close(fig)

We first parse the input file (lines 13-20) and convert the data into NumPy arrays (lines 24-25). Finaly, these arrays are drawn into a graph (lines 29-36). The graph of the solution looks as follows:

In the next example, we demonstrate use of the static ODE solver to solve a system of ODEs, namely the Lorenz system which reads as:

\[ \frac{dx}{dt} = \sigma( x - y),\ \rm{ on }\ (0,T) \]

\[ \frac{dy}{dt} = x(\rho - z ) - y,\ \rm{ on }\ (0,T) \]

\[ \frac{dz}{dt} = xy - \beta z,\ \rm{ on }\ (0,T) \]

\[ \vec u(0) = (x(0),y(0),z(0)) = \vec u_{ini} \]

for given constants \( \sigma, \rho \) and \( \beta \). The solution \( \vec u(t) = (x(t), y(t), z(t)) \in R^3 \) is represented by three-dimensional static vector ( TNL::Containers::StaticVector). The solver looks as follows:

1#include <iostream>
2#include <TNL/Containers/StaticVector.h>
3#include <TNL/Solvers/ODE/StaticEuler.h>
4
5using Real = double;
6
7int main( int argc, char* argv[] )
8{
11 const Real final_t = 25.0;
12 const Real tau = 0.001;
13 const Real output_time_step = 0.01;
14 const Real sigma = 10.0;
15 const Real rho = 28.0;
16 const Real beta = 8.0 / 3.0;
17
18 ODESolver solver;
19 solver.setTau( tau );
20 solver.setTime( 0.0 );
21 Vector u( 1.0, 2.0, 3.0 );
22 while( solver.getTime() < final_t )
23 {
24 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
25 auto f = [=] ( const Real& t, const Real& tau, const Vector& u, Vector& fu ) {
26 const Real& x = u[ 0 ];
27 const Real& y = u[ 1 ];
28 const Real& z = u[ 2 ];
29 fu[ 0 ] = sigma * (y - x );
30 fu[ 1 ] = rho * x - y - x * z;
31 fu[ 2 ] = -beta * z + x * y;
32 };
33 solver.solve( u, f );
34 std::cout << u[ 0 ] << " " << u[ 1 ] << " " << u[ 2 ] << std::endl;
35 }
36}

The code is very similar to the previous example. There are the following differences:

  1. We define the type of the variable u representing the solution \( \vec u(t) \) as TNL::Containers::StaticVector< 3, Real > (line 9) which is reflected even in the definition of the ODE solver (TNL::Solvers::ODE::StaticEuler< Vector > , line 10) and the variable u (line 21).
  2. In addition to the parameters of the solver ( final_t, tau and output_time_step, lines 14-16) we define parameters of the Lorenz system (sigma, rho and beta, lines 14-16).
  3. The initial condition \( \vec u(0) = (1,2,3) \) is set on the line 21.
  4. In the lambda function representing the right-hand side of the Lorenz system (lines 25-32), we first define auxiliary aliases x ,y and z (lines 26-28) to make the code easier to read. The main right-hand side of the Lorenz system is implemented on the lines (29-31).
  5. In the line 3, we print all components of the vector u.

The solver creates file with the solution \( (\sigma(i \tau), \rho( i \tau), \beta( i \tau )) \) for ' \( i = 0, 1, \ldots N \) on separate lines. It looks as follows:

sigma[ 0 ] rho[ 0 ] beta[ 0 ]
sigma[ 1 ] rho[ 1 ] beta[ 1 ]
sigma[ 2 ] rho[ 2 ] beta[ 2 ]
...

Such file can visualized using Gnuplot interactively in 3D as follows

splot 'StaticODESolver-LorenzExample.out' with lines

or it can be processed by the following Python script:

1#!/usr/bin/env python3
2
3import sys
4import matplotlib.pyplot as plt
5from mpl_toolkits.mplot3d import axes3d, Axes3D
6import numpy as np
7
8
10plt.rcParams['text.usetex'] = True
11
12
14f = open( sys.argv[1], 'r' )
15x_lst = []
16y_lst = []
17z_lst = []
18for line in f:
19 line = line.strip()
20 a = line.split()
21 x_lst.append( float( a[ 0 ] ) )
22 y_lst.append( float( a[ 1 ] ) )
23 z_lst.append( float( a[ 2 ] ) )
24
25
27x = np.array(x_lst)
28y = np.array(y_lst)
29z = np.array(z_lst)
30
31
33fig = plt.figure()
34ax = Axes3D(fig)
35theta = np.linspace(-4 * np.pi, 4 * np.pi, 100)
36ax.plot(x, y, z, label='Lorenz attractor')
37ax.legend()
38plt.savefig( sys.argv[2] )
39plt.close(fig)

The script has very similar structure as in the previous example. The result looks as follows:

Combining static ODE solvers with parallel for

The static solvers can be used inside of lambda functions for TNL::Algorithms::ParallelFor for example. This can be useful when we need to solve large number of independent ODE problems, for example for parametric analysis. We demonstrate it on the two examples we have described above.

Solving scalar problems in parallel

The first example solves ODE given by

\[ \frac{d u}{dt} = t \sin ( c t )\ \rm{ on }\ (0,T), \]

\[ u( 0 ) = 0, \]

where \( c \) is a constant. We will solve it in parallel ODEs with different values \( c \in \langle c_{min}, c_{max} \rangle \). The exact solution can be found here. The code reads as follows:

1#include <iostream>
2#include <fstream>
3#include <TNL/Solvers/ODE/StaticEuler.h>
4#include <TNL/Containers/Vector.h>
5#include <TNL/Algorithms/ParallelFor.h>
6
7using Real = double;
8
9template< typename Device >
10void solveParallelODEs( const char* file_name )
11{
14 const Real final_t = 10.0;
15 const Real tau = 0.001;
16 const Real output_time_step = 0.1;
17 const Real c_min = 1.0;
18 const Real c_max = 5.0;
19 const int c_vals = 5.0;
20 const Real c_step = ( c_max - c_min ) / ( c_vals - 1 );
21 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
22
23 Vector results( output_time_steps * c_vals, 0.0 );
24 auto results_view = results.getView();
25 auto f = [=] __cuda_callable__ ( const Real& t, const Real& tau, const Real& u, Real& fu, const Real& c ) {
26 fu = t * sin( c * t );
27 };
28 auto solve = [=] __cuda_callable__ ( int idx ) mutable {
29 const Real c = c_min + idx * c_step;
30 ODESolver solver;
31 solver.setTau( tau );
32 solver.setTime( 0.0 );
33 Real u = 0.0;
34 int time_step( 1 );
35 results_view[ idx ] = u;
36 while( time_step < output_time_steps )
37 {
38 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
39 solver.solve( u, f, c );
40 results_view[ time_step++ * c_vals + idx ] = u;
41 }
42 };
44
45 std::fstream file;
46 file.open( file_name, std::ios::out );
47 for( int i = 0; i < c_vals; i++ )
48 {
49 file << "# c = " << c_min + i * c_step << std::endl;
50 for( int k = 0; k < output_time_steps;k++ )
51 file << k * output_time_step << " " << results.getElement( k * c_vals + i ) << std::endl;
52 file << std::endl;
53 }
54}
55
56int main( int argc, char* argv[] )
57{
58 TNL::String file_name( argv[ 1 ] );
59 file_name += "/StaticODESolver-SineParallelExample-result.out";
60 solveParallelODEs< TNL::Devices::Host >( file_name.getString() );
61#ifdef HAVE_CUDA
62 solveParallelODEs< TNL::Devices::Cuda >( file_name.getString() );
63#endif
64}
T ceil(T... args)
Vector extends Array with algebraic operations.
Definition: Vector.h:40
Class for managing strings.
Definition: String.h:33
T open(T... args)
static void exec(Index start, Index end, Function f, FunctionArgs... args)
Static method for the execution of the loop.
Definition: ParallelFor.h:87

In this example we also show, how to run it on GPU. Therefore we moved the main solver to separate function solveParallelODEs which has one template parameter Device telling on what device it is supposed to run. The results of particular ODEs are stored in a memory and at the end they are copied to a file with given filename file_name. The variable \( u \) is scalar therefore we represent it by the type Real in the solver (line 12). Next we define parameters of the ODE solver (final_t, tau and output_time_step, lines 14-16), interval for the parameter of the ODE \( c \in \langle c_{min}, c_{max} \rangle \) ( c_min, c_max, lines 17-18) and number of values c_vals (line 19) distributed equidistantly in the interval with step c_step (line 20). We use the number of different values of the parameter c as a range for the parallel for on the line 43. This parallel for processes the lambda function solve which is defined on the lines 28-42. It receives a parameter idx which is index of the value of the parameter c. We compute its value on the line 29. Next we create the ODE solver (line 30) and setup its parameters (lines 31-32). We set initial condition of the given ODE and we define variable time_step which counts checkpoints which we store in the memory in vector results allocated in the line 23 and accessed in the lambda function via vector view results_view (defined on the line 24). We iterate over the interval \( (0, T) \) in the while loop starting on the line 36. We set the stop time of the ODE solver (line 38) and we run the solver (line 39). Finally we store the result at given checkpoint into vector view results_view. If the solver runs on the GPU, it cannot write the checkpoints into a file. This is done in postprocessing in the lines 45-53.

Note, how we pass the value of parameter c to the lambda function f. The method solve of the ODE solvers(TNL::Solvers::ODE::StaticEuler::solve, for example) accepts user defined parameters via variadic templates. It means that in addition to the variable u and the right-hand side f we can add any other parameters like c in this example (line 39). This parameter appears in the lambda function f (line 25). The reason for this is that the nvcc compiler (version 10.1) does not accept lambda function defined within another lambda function. If such a construction is accepted by a compiler, the lambda function f which can be defined within the lambda function solve and the variable c defined in the lambda function solve could be captured by f.

The solver generates file of the following format:

# c = c[ 0 ]
x[ 0 ] u( c[ 0 ], x[ 0 ] )
x[ 1 ] u( c[ 0 ], x[ 1 ] )
....
# c = c[ 1 ]
x[ 0 ] u( c[ 1 ], x[ 0 ] )
x[ 1 ] u( c[ 1 ], x[ 1 ] )
...

The file an visualized using Gnuplot as follows

splot 'StaticODESolver-SineParallelExample-result.out' with lines

or it can be processed by the following Python script:

1#!/usr/bin/env python3
2
3import sys
4import matplotlib.pyplot as plt
5import numpy as np
6
7
9plt.rcParams['text.usetex'] = True
10
11
13f = open( sys.argv[1], 'r' )
14current_c = 0.0
15x_lst = []
16u_lst = []
17x_data = []
18u_data = []
19parameters = []
20
21for line in f:
22 line = line.strip()
23 a = line.split()
24 if not a:
25 continue
26 if a[ 0 ] == "#":
27 if x_lst:
28 parameters.append( current_c )
29 u_data.append( np.array( u_lst ) )
30 if not x_data:
31 x_data.append( np.array( x_lst ) )
32 u_lst.clear()
33
34 current_c = float( a [ 3 ] )
35 else:
36 if not x_data:
37 x_lst.append( float( a[ 0 ] ) )
38 u_lst.append( float( a[ 1 ] ) )
39parameters.append( current_c )
40u_data.append( np.array( u_lst ) )
41
42
44n = len( parameters )
45fig, ax = plt.subplots( 1, n, figsize=(15, 3), sharey=True )
46idx = 0
47for u in u_data:
48 ax[ idx ].plot( x_data[0], u, linewidth=2.0, label=f"$c={parameters[idx]}$" )
49 ax[ idx ].set_xlabel( "t" )
50 ax[ idx ].set_ylabel( "u(t)" )
51 idx = idx + 1
52plt.savefig( sys.argv[2] )
53plt.close(fig)
54
55
56
57

The result of this example looks as follows:

Solving the Lorenz system in parallel

The second example is a parametric analysis of the Lorenz model

\[ \frac{dx}{dt} = \sigma( x - y),\ \rm{ on }\ (0,T) \]

\[ \frac{dy}{dt} = x(\rho - z ) - y,\ \rm{ on }\ (0,T) \]

\[ \frac{dz}{dt} = xy - \beta z,\ \rm{ on }\ (0,T) \]

\[ \vec u(0) = (x(0),y(0),z(0)) = \vec u_{ini} \]

which we solve for different values of the model parameters

\[ \sigma_i = \sigma_{min} + i \Delta \sigma, \]

\[ \rho_j = \rho_{min} + j \Delta \rho, \]

\[ \beta_k = \beta_{min} + k \Delta \beta, \]

where we set \( \Delta \sigma = \Delta \rho = \Delta \beta = l / (p-1) \) and where \( i,j,k = 0, 1, \ldots, p - 1 \). The code of the solver looks as follows:

1#include <iostream>
2#include <fstream>
3#include <TNL/Solvers/ODE/StaticEuler.h>
4#include <TNL/Containers/Vector.h>
5#include <TNL/Algorithms/ParallelFor.h>
6
7using Real = double;
8
9template< typename Device >
10void solveParallelODEs( const char* file_name )
11{
14 const Real final_t = 50.0;
15 const Real tau = 0.001;
16 const Real output_time_step = 0.005;
17 const Real sigma_min( 10.0 ), rho_min( 15.0 ), beta_min( 1.0 );
18 const int parametric_steps = 4;
19 const Real sigma_step = 30.0 / ( parametric_steps - 1 );
20 const Real rho_step = 21.0 / ( parametric_steps - 1 );
21 const Real beta_step = 15.0 / ( parametric_steps - 1 );
22 const int output_time_steps = ceil( final_t / output_time_step ) + 1;
23
24 const int results_size( output_time_steps * parametric_steps * parametric_steps * parametric_steps );
25 TNL::Containers::Vector< Vector, Device > results( results_size, 0.0 );
26 auto results_view = results.getView();
27 auto f = [=] __cuda_callable__ ( const Real& t, const Real& tau, const Vector& u, Vector& fu,
28 const Real& sigma_i, const Real& rho_j, const Real& beta_k ) {
29 const Real& x = u[ 0 ];
30 const Real& y = u[ 1 ];
31 const Real& z = u[ 2 ];
32 fu[ 0 ] = sigma_i * (y - x );
33 fu[ 1 ] = rho_j * x - y - x * z;
34 fu[ 2 ] = -beta_k * z + x * y;
35 };
36 auto solve = [=] __cuda_callable__ ( int i, int j, int k ) mutable {
37 const Real sigma_i = sigma_min + i * sigma_step;
38 const Real rho_j = rho_min + j * rho_step;
39 const Real beta_k = beta_min + k * beta_step;
40
41 ODESolver solver;
42 solver.setTau( tau );
43 solver.setTime( 0.0 );
44 Vector u( 1.0, 1.0, 1.0 );
45 int time_step( 1 );
46 results_view[ ( i * parametric_steps + j ) * parametric_steps + k ] = u;
47 while( time_step < output_time_steps )
48 {
49 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
50 solver.solve( u, f, sigma_i, rho_j, beta_k );
51 const int idx = ( ( time_step++ * parametric_steps + i ) * parametric_steps + j ) * parametric_steps + k;
52 results_view[ idx ] = u;
53 }
54 };
55 TNL::Algorithms::ParallelFor3D< Device >::exec( 0, 0, 0, parametric_steps, parametric_steps, parametric_steps, solve );
56
57 std::fstream file;
58 file.open( file_name, std::ios::out );
59 for( int sigma_idx = 0; sigma_idx < parametric_steps; sigma_idx++ )
60 for( int rho_idx = 0; rho_idx < parametric_steps; rho_idx++ )
61 for( int beta_idx = 0; beta_idx < parametric_steps; beta_idx++ )
62 {
63 Real sigma = sigma_min + sigma_idx * sigma_step;
64 Real rho = rho_min + rho_idx * rho_step;
65 Real beta = beta_min + beta_idx * beta_step;
66 file << "# sigma " << sigma << " rho " << rho << " beta " << beta << std::endl;
67 for( int i = 0; i < output_time_steps - 1; i++ )
68 {
69 int offset = ( ( i * parametric_steps + sigma_idx ) * parametric_steps + rho_idx ) * parametric_steps + beta_idx;
70 auto u = results.getElement( offset );
71 file << u[ 0 ] << " " << u[ 1 ] << " " << u[ 2 ] << std::endl;
72 }
73 file << std::endl;
74 }
75}
76
77int main( int argc, char* argv[] )
78{
79 TNL::String file_name( argv[ 1 ] );
80 file_name += "/StaticODESolver-LorenzParallelExample-result.out";
81
82 solveParallelODEs< TNL::Devices::Host >( file_name.getString() );
83#ifdef HAVE_CUDA
84 solveParallelODEs< TNL::Devices::Cuda >( file_name.getString() );
85#endif
86}
static void exec(Index startX, Index startY, Index startZ, Index endX, Index endY, Index endZ, Function f, FunctionArgs... args)
Static method for the execution of the loop.
Definition: ParallelFor.h:190

It is very similar to the previous one. There are just the following changes:

  1. On the line 17, we define minimal values for the parameters \( \sigma, \beta \) and \( \rho \). On the line 18, we define how many different values we will consider for each parameter. The size of equidistant steps in the parameter variations is defined on the line 19. The interval for parameters variations is set to \( [0,30] \) (line 19 as well).
  2. On the line 23, we allocate vector results into which we will store the solution of the Lorenz problem for various parameters.
  3. Next we define the lambda function f representing the right-hand side of the Lorenz problem (lines 25-33) and the lambda function solve representing the ODE solver for the Lorenz problem with particular setup of the parameters (lines 34-52). This lambda function is processed by TNL::Algorithms::ParallelFor3D called on the line 53. Therefore the lambda function solve receives three indexes i, j and k which are used to compute particular values of the parameters \( \sigma_i, \rho_j, \beta_k \) which are represented by variables sigma_i, rho_j and beta_k (lines 35-37). These parameters must be passed to the lambda function f explicitly (line 48). The reason is the same as in the previous example - nvcc (version 10.1) does not accept a lambda function defined within another lambda function.
  4. The initial condition for the Lorenz problem is set to vector \( (1,1,1) \) (line 42). Finally, we start the time loop (lines 45-51) and we store the state of the solution into the vector results using related vector view results_view in the time intervals given by the variable output_time_step.
  5. When all ODEs ares solved, we copy all the solutions from the vector results into an output file.

The files has the following format:

# sigma = c[ 0 ] rho = rho[ 0 ] beta = beta[ 0 ]
x[ 0 ] u( sigma[ 0 ], rho[ 0 ], beta[ 0 ], x[ 0 ] )
x[ 1 ] u( sigma[ 0 ], rho[ 0 ], beta[ 0 ], x[ 1 ] )
....
# sigma = c[ 1 ] rho = rho[ 1 ] beta = beta[ 1 ]
x[ 0 ] u( sigma[ 1 ], rho[ 1 ], beta[ 1 ], x[ 0 ] )
x[ 1 ] u( sigma[ 1 ], rho[ 1 ], beta[ 1 ], x[ 1 ] )
...

The file can be processed by the following Python script:

1#!/usr/bin/env python3
2
3import sys
4import matplotlib.pyplot as plt
5import numpy as np
6from mpl_toolkits.mplot3d import axes3d, Axes3D
7
8plt.rcParams['text.usetex'] = True
9
10
12f = open( sys.argv[1], 'r' )
13current_sigma = 0.0
14current_rho = 0.0
15current_beta = 0.0
16sigma_lst = []
17rho_lst = []
18beta_lst = []
19x_lst = []
20y_lst = []
21z_lst = []
22x_data = []
23y_data = []
24z_data = []
25parameters = []
26data = {}
27size = 0
28for line in f:
29 line = line.strip()
30 a = line.split()
31 if not a:
32 continue
33 if a[ 0 ] == "#":
34 if x_lst:
35 parameters_tuple = ( current_sigma, current_rho, current_beta )
36 parameters.append( parameters_tuple )
37 data_tuple = ( np.array( x_lst ), np.array( y_lst ), np.array( z_lst ) )
38 data[ parameters_tuple ] = data_tuple
39 x_lst.clear()
40 y_lst.clear()
41 z_lst.clear()
42
43 current_sigma = float( a[ 2 ] )
44 current_rho = float( a[ 4 ] )
45 current_beta = float( a [ 6 ] )
46 if current_sigma not in sigma_lst:
47 sigma_lst.append( current_sigma )
48 if current_rho not in rho_lst:
49 rho_lst.append( current_rho )
50 if current_beta not in beta_lst:
51 beta_lst.append( current_beta )
52 else:
53 x_lst.append( float( a[ 0 ] ) )
54 y_lst.append( float( a[ 1 ] ) )
55 z_lst.append( float( a[ 2 ] ) )
56parameters_tuple = ( current_sigma, current_rho, current_beta )
57parameters.append( parameters_tuple )
58data_tuple = ( np.array( x_lst ), np.array( y_lst ), np.array( z_lst ) )
59data[ parameters_tuple ] = data_tuple
60
61
63sigma_n = len( sigma_lst )
64sigma_idx = 1
65for sigma in sigma_lst:
66 rho_n = len( rho_lst )
67 beta_n = len( beta_lst )
68
69 fig, ax = plt.subplots( rho_n, beta_n, figsize=(8, 8), sharey=True, sharex=True )
70 fig.suptitle( f'$\sigma={sigma}$')
71 #ax = Axes3D(fig) does not work with ax indexing
72 rho_idx = 0
73 beta_idx = 0
74 for rho in rho_lst:
75 for beta in beta_lst:
76 parameters_tuple = ( sigma, rho, beta )
77 data_tuple = data[ parameters_tuple ]
78 ax[ rho_idx, beta_idx ].plot( data_tuple[ 1 ], data_tuple[ 2 ], linewidth=1.0 )
79 if beta_idx == 0:
80 ax[ rho_idx, beta_idx ].set_ylabel( f'$\\rho={rho}$' )
81 if rho_idx == rho_n-1:
82 ax[ rho_idx, beta_idx ].set_xlabel( f'$\\beta={beta}$' )
83 beta_idx = beta_idx + 1
84 beta_idx = 0
85 rho_idx = rho_idx + 1
86
87 plt.savefig( f"{sys.argv[2]}-{sigma_idx}.png" )
88 sigma_idx = sigma_idx + 1
89 plt.close(fig)
90
91
92
93

The result looks as follows:

Non-static ODE Solvers

In this section, we will show how to solve simple heat equation in 1D. The heat equation is a parabolic partial differential equation which reads as follows:

\[ \frac{\partial u(t,x)}{\partial t} - \frac{\partial^2 u(t,x)}{\partial^2 x} = 0\ \rm{on}\ (0,T) \times (0,1), \]

\[ u(t,0) = 0, \]

\[ u(t,0) = 1, \]

\[ u(0,x) = u_{ini}(x)\ \rm{on}\ [0,1]. \]

We discretize it by the finite difference method to get numerical approximation. We first define set of nodes \( x_i = ih \) for \(i=0,\ldots n-1 \) where \(h = 1 / (n-1) \) (we use C++ indexing for the consistency). Using the method of lines and approximating the second derivative by the central finite diference ( \( \frac{\partial^2 u(t,x)}{\partial^2 x} \approx \frac{u_{i-1} - 2 u_i + u_{i+1}}{h^2} \)), we obtain system of ODEs of the following form

\[ \frac{\rm{d}u_i(t)}{\rm{d} t} = \frac{u_{i-1} - 2 u_i + u_{i+1}}{h^2}\ \rm{for}\ i = 1, \ldots, n-2, \]

where \( u_i(t) = u(t,ih) \) and \( h \) space step between two adjacent nodes of the numerical mesh. We also set

\[ u_0(t) = u_{n-1}(t) = 0\ \rm{on}\ [0,T]. \]

What are the main differences compared to the Lorenz model?

  1. The size of the Lorenz model is fixed. It is equal to three since \( (\sigma, \rho, \beta) \in R^3 \) which is small vector of fixed size and it can be represented by the static vector TNL::Containers::StaticVector< 3, Real >. On the other hand, the size of the ODE system arising in the discretization by the method of lines depends not on the problem we solve but on the desired accuracy - the larger \( n \) the more accurate numerical approximation we get. The number of nodes \( n \) used for the space discretisation defines the number of parameters defining the mesh function. These parameters are also referred as degrees of freedom, DOFs. Therefore the size of the system can be large and it is better to employ dynamic vector TNL::Containers::Vector for the solution.
  2. The size of the Lorenz model is small and so the evaluation of its right-hand side can be done sequentially by one thread. The size of the ODE system can be very large and evaluating the right-hand side asks for being performed in parallel.
  3. The dynamic vector TNL::Containers::Vector allocates data dynamically and therefore it cannot be created within a GPU kernel which means the ODE solvers cannot be created in the GPU kernel either. For this reason, the lambda function f evaluating the right-hand side of the ODE system is always executed on the host and it calls TNL::Algorithms::ParallelFor to evaluate the right-hand side of the ODE system.

Basic setup

The implementation of the solver looks as follows:

1#include <iostream>
2#include <TNL/FileName.h>
3#include <TNL/Containers/Vector.h>
4#include <TNL/Solvers/ODE/Euler.h>
5#include "write.h"
6
7using Real = double;
8using Index = int;
9
10template< typename Device >
11void solveHeatEquation( const char* file_name )
12{
14 using VectorView = typename Vector::ViewType;
15 using ODESolver = TNL::Solvers::ODE::Euler< Vector >;
16
17 /***
18 * Parameters of the discretisation
19 */
20 const Real final_t = 0.05;
21 const Real output_time_step = 0.005;
22 const Index n = 41;
23 const Real h = 1.0 / ( n - 1 );
24 const Real tau = 0.1 * h * h;
25 const Real h_sqr_inv = 1.0 / ( h * h );
26
27 /***
28 * Initial condition
29 */
30 Vector u( n );
31 u.forAllElements( [=] __cuda_callable__ ( Index i, Real& value ) {
32 const Real x = i * h;
33 if( x >= 0.4 && x <= 0.6 )
34 value = 1.0;
35 else value = 0.0;
36 } );
37 std::fstream file;
38 file.open( file_name, std::ios::out );
39 write( file, u, n, h, ( Real ) 0.0 );
40
41 /***
42 * Setup of the solver
43 */
44 ODESolver solver;
45 solver.setTau( tau );
46 solver.setTime( 0.0 );
47
48 /***
49 * Time loop
50 */
51 Index output_idx( 1 );
52 while( solver.getTime() < final_t )
53 {
54 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
55 auto f = [=] __cuda_callable__ ( Index i, const VectorView& u, VectorView& fu ) mutable {
56 if( i == 0 || i == n-1 ) // boundary nodes -> boundary conditions
57 fu[ i ] = 0.0;
58 else // interior nodes -> approximation of the second derivative
59 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
60 };
61 auto time_stepping = [=] ( const Real& t, const Real& tau, const VectorView& u, VectorView& fu ) {
63 solver.solve( u, time_stepping );
64 write( file, u, n, h, solver.getTime() ); // write the current state to a file
65 }
66}
67
68int main( int argc, char* argv[] )
69{
70 TNL::String file_name( argv[ 1 ] );
71 file_name += "/ODESolver-HeatEquationExample-result.out";
72
73 solveHeatEquation< TNL::Devices::Host >( file_name.getString() );
74#ifdef HAVE_CUDA
75 solveHeatEquation< TNL::Devices::Cuda >( file_name.getString() );
76#endif
77}
Solver of ODEs with the first order of accuracy.
Definition: Euler.h:46

The solver is implemented in separate function solveHeatEquation (line 11) having one template parameter Device (line 10) defining on what device the solver is supposed to be executed. We first define necessary types:

  1. Vector (line 13) is alias for TNL::Containers::Vector. The number of DOFs can be large and therefore they are stored in the resizable dynamically allocated vector TNL::Containers::Vector rather then in static vector - TNL::Containers::StaticVector.
  2. VectorView (line 14) will be used for accessing DOFs within the lambda functions.
  3. ODESolver (line 15) is type of the ODE solver which we will use for the computation of the time evolution in the time dependent heat equation.

Next we define parameters of the discretization:

  1. final_t (line 20) represents the size of the time interval \( [0,T] \).
  2. output_time_step (line 21) defines time intervals in which we will write the solution \( u \) into a file.
  3. n (line 22) is the number of DOFs, i.e. number of nodes used for the finite difference method.
  4. h (line 23) is the space step, i.e. distance between two consecutive nodes.
  5. tau (line 24) is the time step used for the integration by the ODE solver. Since we solve the second order parabolic problem, the time step should be proportional to \( h^2 \).
  6. h_sqr_inv (line 25) is auxiliary constant equal to \( 1/h^2 \) which we use later in finite difference for approximation of the second derivative.

Next we set the initial condition \( u_{ini} \) (lines ) which is given as:

\[ u_{ini}(x) = \left\{ \begin{array}{rl} 0 & \rm{for}\ x < 0.4, \\ 1 & \rm{for}\ 0.4 \leq x \leq 0.6, \\ 0 & \rm{for}\ x > 0. \\ \end{array} \right. \]

Next we write the initial condition to a file (lines 37-39) using the function write which we will describe later. On the lines (44-46) we create instance of the ODE solver solver, we set the integration time step tau of the solver (TNL::Solvers::ODE::ExplicitSolver::setTau ) and we set the initial time to zero (TNL::Solvers::ODE::ExplicitSolver::setTime).

Finally, we proceed to the time loop (lines 52-64) but before we prepare counter of the states to be written into files (output_idx). The time loop uses the time variable within the ODE solver (TNL::Solvers::ODE::ExplicitSolver::getTime ) and it iterates until we reach the end of the time interval \( [0, T] \) given by the variable final_t. On the line 54, we set the stop time of the ODE solver (TNL::Solvers::ODE::ExplicitSolver::setStopTime ) to the next checkpoint for storing the state of the heat equation or the end of the time interval depending on what comes first. Next we define the lambda function f expressing the discretization of the second derivative of \( u \) by the central finite difference and the boundary conditions. The function receives the following parameters:

  1. i is the index of the node and the related ODE arising from the method of lines. In fact, we have to evaluate the update of \( u_i^k \) to get to the next time level \( u_i^{k+1} \).
  2. u is vector view representing the state \( u_i^k \) of the heat equation on the \( k- \) time level.
  3. fu is vector of updates or time derivatives in the method of lines which will bring \( u \) to the next time level.

As we mentioned above, since nvcc does not accept lambda functions defined within another lambda function, we have to define f separately and pass the parameters u and fu explicitly (see the line 62).

Now look at the code of the lambda function f. Since the solution \( u \) does not change on the boundaries, we return zero on the boundary nodes (lines 56-57) and we evaluate the central difference for approximation of the second derivative on the interior nodes (line 59).

Next we define the lambda function time_stepping (lines ) which is responsible for computing of the updates for all nodes \( i = 0, \ldots n-1 \). It is done by means of TNL::Algorithms::ParallelFor which iterates over all the nodes and calling the function f on each of them. It passes the vector views u and fu explicitly to f for the reasons we have mentioned above.

Finally, we run the ODE solver (TNL::Solvers::ODE::Euler::solve ) (line 63) and we pass u as the current state of the heat equation and f the lambda function controlling the time evolution to the method solve. On the line 64, we store the current state to a file.

The function write which we use for writing the solution of the heat equation reads as follows:

1#include <fstream>
2
3template< typename Vector,
4 typename Real = typename Vector::RealType,
5 typename Index = typename Vector::IndexType >
6void write( std::fstream& file, const Vector& u, const Index n, const Real& h, const Real& time )
7{
8 file << "# time = " << time << std::endl;
9 for( Index i = 0; i < n; i++ )
10 file << i*h << " " << u.getElement( i ) << std::endl;
11 file << std::endl;
12}
T time(T... args)

The function accepts the following parameters:

  1. file is the file into which we want to store the solution.
  2. u is a vector or vector view representing solution of the heat equation at given time.
  3. n is number of nodes the we use for the approximation of the solution.
  4. h is space step, i.e. distance between two consecutive nodes.
  5. time is the current time of the evolution being computed.

The solver writes the results in the following format:

# time = t[ 0 ]
x[ 0 ] u( t[ 0 ], x[ 0 ] )
x[ 1 ] u( t[ 0 ], x[ 1 ] )
x[ 2 ] u( t[ 0 ], x[ 2 ] )
...
# time = t[ 1 ]
x[ 0 ] u( t[ 1 ], x[ 0 ] )
x[ 1 ] u( t[ 1 ], x[ 1 ] )
x[ 2 ] u( t[ 1 ], x[ 2 ] )
...

The solution can be visualised with Gnuplot as follows:

plot 'ODESolver-HeatEquationExample-result.out' with lines

or it can be easily parsed in Python and processes by Matplotlib using the following script:

1#!/usr/bin/env python3
2
3import sys
4import matplotlib.pyplot as plt
5import numpy as np
6from mpl_toolkits.mplot3d import axes3d, Axes3D
7
8plt.rcParams['text.usetex'] = True
9
10f = open( sys.argv[1], 'r' )
11current_time = 0.0
12time_lst = []
13x_lst = []
14u_lst = []
15x_data = []
16u_data = []
17size = 0
18for line in f:
19 line = line.strip()
20 a = line.split()
21 if not a:
22 continue
23 if a[ 0 ] == "#":
24 if u_lst:
25 time_lst.append( current_time )
26 u_data.append( np.array( u_lst ) )
27 if not x_data:
28 x_data.append( np.array( x_lst ) )
29 x_lst.clear()
30 u_lst.clear()
31
32 current_time = float( a[ 3 ] )
33 time_lst.append( current_time )
34 else:
35 x_lst.append( float( a[ 0 ] ) )
36 u_lst.append( float( a[ 1 ] ) )
37if u_lst:
38 time_lst.append( current_time )
39 u_data.append( np.array( u_lst ) )
40 if not x_data:
41 x_data.append( np.array( x_lst ) )
42 x_lst.clear()
43 u_lst.clear()
44
45
46fig, ax = plt.subplots( 1, 1, figsize=(4, 4) )
47for u in u_data:
48 ax.plot( x_data[ 0 ], u, linewidth=2.0 )
49
50ax.set_ylabel( f'$ u(x) $' )
51ax.set_xlabel( f'$x $' )
52plt.savefig( f"{sys.argv[2]}.png" )
53plt.close(fig)

The result looks as follows:

Setup with a solver monitor

In this section we will show how to connect ODE solver with the solver monitor. Look at the following example:

1#include <iostream>
2#include <TNL/FileName.h>
3#include <TNL/Containers/Vector.h>
4#include <TNL/Solvers/ODE/Euler.h>
5#include <TNL/Solvers/IterativeSolverMonitor.h>
6#include "write.h"
7
8using Real = double;
9using Index = int;
10
11template< typename Device >
12void solveHeatEquation( const char* file_name )
13{
15 using VectorView = typename Vector::ViewType;
16 using ODESolver = TNL::Solvers::ODE::Euler< Vector >;
17
18 /***
19 * Parameters of the discertisation
20 */
21 const Real final_t = 0.05;
22 const Real output_time_step = 0.005;
23 const Index n = 41;
24 const Real h = 1.0 / ( n - 1 );
25 const Real tau = 0.001 * h * h;
26 const Real h_sqr_inv = 1.0 / ( h * h );
27
28 /***
29 * Initial condition
30 */
31 Vector u( n );
32 u.forAllElements( [=] __cuda_callable__ ( Index i, Real& value ) {
33 const Real x = i * h;
34 if( x >= 0.4 && x <= 0.6 )
35 value = 1.0;
36 else value = 0.0;
37 } );
38 std::fstream file;
39 file.open( file_name, std::ios::out );
40 write( file, u, n, h, ( Real ) 0.0 );
41
42 /***
43 * Setup monitor for the ODE solver.
44 */
45 using IterativeSolverMonitorType = TNL::Solvers::IterativeSolverMonitor< Real, Index >;
46 IterativeSolverMonitorType monitor;
47 TNL::Solvers::SolverMonitorThread monitorThread(monitor);
48 monitor.setRefreshRate(10); // refresh rate in miliseconds
49 monitor.setVerbose(1);
50 monitor.setStage( "ODE solver stage:" );
51
52 /***
53 * Setup of the solver
54 */
55 ODESolver solver;
56 solver.setTau( tau );
57 solver.setTime( 0.0 );
58 solver.setSolverMonitor( monitor );
59
60 /***
61 * Time loop
62 */
63 Index output_idx( 1 );
64 while( solver.getTime() < final_t )
65 {
66 solver.setStopTime( TNL::min( solver.getTime() + output_time_step, final_t ) );
67 auto f = [=] __cuda_callable__ ( Index i, const VectorView& u, VectorView& fu ) mutable {
68 if( i == 0 || i == n-1 ) // boundary nodes -> boundary conditions
69 fu[ i ] = 0.0;
70 else // interior nodes -> approximation of the second derivative
71 fu[ i ] = h_sqr_inv * ( u[ i - 1 ] - 2.0 * u[ i ] + u[ i + 1 ] );
72 };
73 auto time_stepping = [=] ( const Real& t, const Real& tau, const VectorView& u, VectorView& fu ) {
75 solver.solve( u, time_stepping );
76 write( file, u, n, h, solver.getTime() ); // write the current state to a file
77 }
78 monitor.stopMainLoop();
79}
80
81int main( int argc, char* argv[] )
82{
83 TNL::String file_name( argv[ 1 ] );
84 file_name += "/ODESolver-HeatEquationExample-result.out";
85
86 solveHeatEquation< TNL::Devices::Host >( file_name.getString() );
87#ifdef HAVE_CUDA
88 solveHeatEquation< TNL::Devices::Cuda >( file_name.getString() );
89#endif
90}
A RAII wrapper for launching the SolverMonitor's main loop in a separate thread.
Definition: SolverMonitor.h:137

There are the following differences compared to the previous example:

  1. We have to include a header file with the iterative solver monitor (line 5).
  2. We have to setup the solver monitor (lines 45-50). First, we define the monitor type (line 45) and we create an instance of the monitor (line 46). Next we create a separate thread for the monitor (line 47), set the refresh rate to 10 milliseconds (line 48), turn on the verbose mode (line 49) and set the solver stage name (line 50). On the line 58, we connect the monitor with the solver using method TNL::Solvers::IterativeSolver::setSolverMonitor. We stop the monitor after the ODE solver finishes by calling TNL::Solvers::IterativeSolverMonitor::stopMainLoop in the line 78.

The result looks as follows:

ODE solver stage: ITER: 1795 RES: 7.0538
ODE solver stage: ITER: 3532 RES: 2.9097
ODE solver stage: ITER: 4741 RES: 1.6318
ODE solver stage: ITER: 6544 RES: 1.4327
ODE solver stage: ITER: 5238 RES: 18.562
ODE solver stage: ITER: 2455 RES: 11.389
ODE solver stage: ITER: 7700 RES: 8.0134
ODE solver stage: ITER: 4943 RES: 5.9451
ODE solver stage: ITER: 2167 RES: 4.5169
ODE solver stage: ITER: 6638 RES: 3.6383
ODE solver stage: ITER: 1966 RES: 3.1144
ODE solver stage: ITER: 7229 RES: 2.4843
ODE solver stage: ITER: 4285 RES: 2.0583
ODE solver stage: ITER: 7971 RES: 1.8336
ODE solver stage: ITER: 4223 RES: 1.6487
ODE solver stage: ITER: 1514 RES: 1.5119
ODE solver stage: ITER: 6820 RES: 1.4564
ODE solver stage: ITER: 4088 RES: 1.4382
ODE solver stage: ITER: 1360 RES: 1.424
ODE solver stage: ITER: 6660 RES: 1.4022