Template Numerical Library version develop:1f26cbe9a
Linear solvers tutorial

Introduction

Solvers of linear systems are one of the most important algorithms in scientific computations. TNL offers the followiing iterative methods:

  1. Stationary methods
    1. Jacobi method (TNL::Solvers::Linear::Jacobi)
    2. Successive-overrelaxation method, SOR (TNL::Solvers::Linear::SOR)
  2. Krylov subspace methods
    1. Conjugate gradient method, CG (TNL::Solvers::Linear::CG)
    2. Biconjugate gradient stabilized method, BICGStab (TNL::Solvers::Linear::BICGStab)
    3. Biconjugate gradient stabilized method, BICGStab(l) (TNL::Solvers::Linear::BICGStabL)
    4. Transpose-free quasi-minimal residual method, TFQMR (TNL::Solvers::Linear::TFQMR)
    5. Generalized minimal residual method, GMRES (TNL::Solvers::Linear::GMRES) with various methods of orthogonalization
      1. Classical Gramm-Schmidt, CGS
      2. Classical Gramm-Schmidt with reorthogonalization, CGSR
      3. Modified Gramm-Schmidt, MGS
      4. Modified Gramm-Schmidt with reorthogonalization, MGSR
      5. Compact WY form of the Householder reflections, CWY

The iterative solvers (not the stationary solvers like TNL::Solvers::Linear::Jacobi and TNL::Solevrs::Linear::SOR) can be combined with the following preconditioners:

  1. Diagonal or Jacobi - TNL::Solvers::Linear::Preconditioners::Diagonal
  2. ILU (Incomplete LU) - CPU only currently
    1. ILU(0) TNL::Solvers::Linear::Preconditioners::ILU0
    2. ILUT (ILU with thresholding) TNL::Solvers::Linear::Preconditioners::ILUT

Iterative solvers of linear systems

Basic setup

All iterative solvers for linear systems can be found in the namespace TNL::Solvers::Linear. The following example shows the use the iterative solvers:

1#include <iostream>
2#include <memory>
3#include <TNL/Algorithms/ParallelFor.h>
4#include <TNL/Matrices/SparseMatrix.h>
5#include <TNL/Devices/Host.h>
6#include <TNL/Devices/Cuda.h>
7#include <TNL/Solvers/Linear/TFQMR.h>
8
9template< typename Device >
10void iterativeLinearSolverExample()
11{
12 /***
13 * Set the following matrix (dots represent zero matrix elements):
14 *
15 * / 2.5 -1 . . . \
16 * | -1 2.5 -1 . . |
17 * | . -1 2.5 -1. . |
18 * | . . -1 2.5 -1 |
19 * \ . . . -1 2.5 /
20 */
23 const int size( 5 );
24 auto matrix_ptr = std::make_shared< MatrixType >();
25 matrix_ptr->setDimensions( size, size );
26 matrix_ptr->setRowCapacities( Vector( { 2, 3, 3, 3, 2 } ) );
27
28 auto f = [=] __cuda_callable__ ( typename MatrixType::RowView& row ) mutable {
29 const int rowIdx = row.getRowIndex();
30 if( rowIdx == 0 )
31 {
32 row.setElement( 0, rowIdx, 2.5 ); // diagonal element
33 row.setElement( 1, rowIdx+1, -1 ); // element above the diagonal
34 }
35 else if( rowIdx == size - 1 )
36 {
37 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
38 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
39 }
40 else
41 {
42 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
43 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
44 row.setElement( 2, rowIdx+1, -1.0 ); // element above the diagonal
45 }
46 };
47
48 /***
49 * Set the matrix elements.
50 */
51 matrix_ptr->forAllRows( f );
52 std::cout << *matrix_ptr << std::endl;
53
54 /***
55 * Set the right-hand side vector.
56 */
57 Vector x( size, 1.0 );
58 Vector b( size );
59 matrix_ptr->vectorProduct( x, b );
60 x = 0.0;
61 std::cout << "Vector b = " << b << std::endl;
62
63 /***
64 * Solve the linear system.
65 */
67 LinearSolver solver;
68 solver.setMatrix( matrix_ptr );
69 solver.setConvergenceResidue( 1.0e-6 );
70 solver.solve( b, x );
71 std::cout << "Vector x = " << x << std::endl;
72}
73
74int main( int argc, char* argv[] )
75{
76 std::cout << "Solving linear system on host: " << std::endl;
77 iterativeLinearSolverExample< TNL::Devices::Sequential >();
78
79#ifdef HAVE_CUDA
80 std::cout << "Solving linear system on CUDA device: " << std::endl;
81 iterativeLinearSolverExample< TNL::Devices::Cuda >();
82#endif
83}
Vector extends Array with algebraic operations.
Definition: Vector.h:40
Implementation of sparse matrix, i.e. matrix storing only non-zero elements.
Definition: SparseMatrix.h:50
void setMatrix(const MatrixPointer &matrix)
Set the matrix of the linear system.
Definition: LinearSolver.h:127
Iterative solver of linear systems based on the Transpose-free quasi-minimal residual (TFQMR) method.
Definition: TFQMR.h:26
T endl(T... args)

In this example we solve a linear system \( A \vec x = \vec b \) where

\[ A = \left( \begin{array}{cccc} 2.5 & -1 & & & \\ -1 & 2.5 & -1 & & \\ & -1 & 2.5 & -1 & \\ & & -1 & 2.5 & -1 \\ & & & -1 & 2.5 \\ \end{array} \right) \]

The right-hand side vector \(\vec b \) is set to \(( 1.5, 0.5, 0.5, 0.5, 1.5 )^T \) so that the exact solution is \( \vec x = ( 1, 1, 1, 1, 1 )^T\). The matrix elements of \(A $\) is set on the lines 12-51 by the means of the method TNL::Matrices::SparseMatrix::forAllElements. In this example, we use the sparse matrix but any other matrix type can be used as well (see the namespace TNL::Matrices). Next we set the solution vector \( \vec x = ( 1, 1, 1, 1, 1 )^T\) (line 57) and multiply it with matrix \( A \) to get the right-hand side vector \(\vec b\) (lines 58-59). Finally, we reset the vector \(\vec x \) to zero vector.

To solve the linear system, we use TFQMR method (line 66), as an example. Other solvers can be used as well (see the namespace TNL::Solvers::Linear). The solver needs only one template parameter which is the matrix type. Next we create an instance of the solver (line 67 ) and set the matrix of the linear system (line 68). Note, that matrix is passed to the solver as a shared smart pointer (std::shared_ptr). This is why we created an instance of the smart pointer on the line 24 instead of the sparse matrix itself. On the line 69, we set the value of residue under which the convergence occur and the solver is stopped. It serves as a convergence criterion. The solver is executed on the line 70 by calling the method TNL::Solvers::Linear::LinearSolver::solve. The method accepts the right-hand side vector \( \vec b\) and the solution vector \( \vec x\).

The result looks as follows:

Solving linear system on host:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
Vector x = [ 1, 1, 1, 1, 1 ]
Solving linear system on CUDA device:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
Vector x = [ 1, 1, 1, 1, 1 ]

Setup with a solver monitor

Solution of large linear systems may take a lot of time. In such situations, it is useful to be able to monitor the convergence of the solver of the solver status in general. For this purpose, TNL offers solver monitors. The solver monitor prints (or somehow visualizes) the number of iterations, the residue of the current solution approximation or some other metrics. Sometimes such information is printed after each iteration or after every ten iterations. The problem of this approach is the fact that one iteration of the solver may take only few milliseconds but also several minutes. In the former case, the monitor creates overwhelming amount of output which may even slowdown the solver. In the later case, the user waits long time for update of the solver status. The monitor in TNL rather runs in separate thread and it refreshes the status of the solver in preset time periods. The use of the iterative solver monitor is demonstrated in the following example.

1#include <iostream>
2#include <memory>
3#include <chrono>
4#include <thread>
5#include <TNL/Algorithms/ParallelFor.h>
6#include <TNL/Matrices/SparseMatrix.h>
7#include <TNL/Devices/Sequential.h>
8#include <TNL/Devices/Cuda.h>
9#include <TNL/Solvers/Linear/Jacobi.h>
10
11template< typename Device >
12void iterativeLinearSolverExample()
13{
14 /***
15 * Set the following matrix (dots represent zero matrix elements):
16 *
17 * / 2.5 -1 . . . \
18 * | -1 2.5 -1 . . |
19 * | . -1 2.5 -1. . |
20 * | . . -1 2.5 -1 |
21 * \ . . . -1 2.5 /
22 */
25 const int size( 5 );
26 auto matrix_ptr = std::make_shared< MatrixType >();
27 matrix_ptr->setDimensions( size, size );
28 matrix_ptr->setRowCapacities( Vector( { 2, 3, 3, 3, 2 } ) );
29
30 auto f = [=] __cuda_callable__ ( typename MatrixType::RowView& row ) mutable {
31 const int rowIdx = row.getRowIndex();
32 if( rowIdx == 0 )
33 {
34 row.setElement( 0, rowIdx, 2.5 ); // diagonal element
35 row.setElement( 1, rowIdx+1, -1 ); // element above the diagonal
36 }
37 else if( rowIdx == size - 1 )
38 {
39 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
40 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
41 }
42 else
43 {
44 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
45 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
46 row.setElement( 2, rowIdx+1, -1.0 ); // element above the diagonal
47 }
48 };
49
50 /***
51 * Set the matrix elements.
52 */
53 matrix_ptr->forAllRows( f );
54 std::cout << *matrix_ptr << std::endl;
55
56 /***
57 * Set the right-hand side vector.
58 */
59 Vector x( size, 1.0 );
60 Vector b( size );
61 matrix_ptr->vectorProduct( x, b );
62 x = 0.0;
63 std::cout << "Vector b = " << b << std::endl;
64
65 /***
66 * Setup solver of the linear system.
67 */
69 LinearSolver solver;
70 solver.setMatrix( matrix_ptr );
71 solver.setOmega( 0.0005 );
72
73 /***
74 * Setup monitor of the iterative solver.
75 */
76 using IterativeSolverMonitorType = TNL::Solvers::IterativeSolverMonitor< double, int >;
77 IterativeSolverMonitorType monitor;
78 TNL::Solvers::SolverMonitorThread monitorThread(monitor);
79 monitor.setRefreshRate(10); // refresh rate in milliseconds
80 monitor.setVerbose(1);
81 monitor.setStage( "Jacobi stage:" );
82 solver.setSolverMonitor(monitor);
83 solver.setConvergenceResidue( 1.0e-6 );
84 solver.solve( b, x );
85 monitor.stopMainLoop();
86 std::cout << "Vector x = " << x << std::endl;
87}
88
89int main( int argc, char* argv[] )
90{
91 std::cout << "Solving linear system on host: " << std::endl;
92 iterativeLinearSolverExample< TNL::Devices::Host >();
93
94#ifdef HAVE_CUDA
95 std::cout << "Solving linear system on CUDA device: " << std::endl;
96 iterativeLinearSolverExample< TNL::Devices::Cuda >();
97#endif
98}
Iterative solver of linear systems based on the Jacobi method.
Definition: Jacobi.h:26
A RAII wrapper for launching the SolverMonitor's main loop in a separate thread.
Definition: SolverMonitor.h:137

On the lines 1-70, we setup the same linear system as in the previous example, we create an instance of the Jacobi solver and we pass the matrix of the linear system to the solver. On the line 71, we set the relaxation parameter \( \omega \) of the Jacobi solver to 0.0005 (TNL::Solvers::Linear::Jacobi). The reason is to slowdown the convergence because we want to see some iterations in this example. Next we create an instance of the solver monitor (lines 76 and 77) and we create a special thread for the monitor (line 78, TNL::Solvers::SolverMonitorThread ). We set the refresh rate of the monitor to 10 milliseconds (line 79, TNL::Solvers::SolverMonitor::setRefreshRate). We set a verbosity of the monitor to 1 (line 80 TNL::Solvers::IterativeSolverMonitor::setVerbosity ). Next we set a name of the solver stage (line 81, TNL::Solvers::IterativeSolverMonitor::setStage). The monitor stages serve for distinguishing between different phases or stages of more complex solvers (for example when the linear system solver is embedded into a time dependent PDE solver). Next we connect the solver with the monitor (line 82, TNL::Solvers::IterativeSolver::setSolverMonitor) and we set the convergence criterion based on the value of the residue (line 83). Finally we start the solver (line 84, TNL::Solvers::Linear::Jacobi::start) and when the solver finishes we have to stop the monitor (line 84, TNL::Solvers::SolverMonitor::stopMainLoop).

The result looks as follows:

Solving linear system on host:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
Jacobi stage: ITER: 10875 RES: 0.13598
Jacobi stage: ITER: 21947 RES: 0.024819
Jacobi stage: ITER: 32945 RES: 0.0045814
Jacobi stage: ITER: 44001 RES: 0.00083844
Jacobi stage: ITER: 55089 RES: 0.00015269
Jacobi stage: ITER: 65959 RES: 2.8745e-05
Jacobi stage: ITER: 77041 RES: 5.2412e-06
Vector x = [ 0.999999, 0.999999, 0.999998, 0.999999, 0.999999 ]
Solving linear system on CUDA device:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
Jacobi stage: ITER: 5607 RES: 0.30775
Jacobi stage: ITER: 11267 RES: 0.12803
Jacobi stage: ITER: 16931 RES: 0.053629
Jacobi stage: ITER: 22587 RES: 0.022496
Jacobi stage: ITER: 28211 RES: 0.0094827
Jacobi stage: ITER: 33839 RES: 0.0039948
Jacobi stage: ITER: 39487 RES: 0.0016778
Jacobi stage: ITER: 45155 RES: 0.00070246
Jacobi stage: ITER: 50817 RES: 0.0002943
Jacobi stage: ITER: 56473 RES: 0.00012345
Jacobi stage: ITER: 62125 RES: 5.1814e-05
Jacobi stage: ITER: 67759 RES: 2.1815e-05
Jacobi stage: ITER: 73415 RES: 9.1505e-06
Jacobi stage: ITER: 79075 RES: 3.836e-06
Jacobi stage: ITER: 84731 RES: 1.6091e-06
Vector x = [ 0.999999, 0.999999, 0.999998, 0.999999, 0.999999 ]

The monitoring of the solver can be improved by time elapsed since the beginning of the computation as demonstrated in the following example:

1#include <iostream>
2#include <memory>
3#include <chrono>
4#include <thread>
5#include <TNL/Timer.h>
6#include <TNL/Algorithms/ParallelFor.h>
7#include <TNL/Matrices/SparseMatrix.h>
8#include <TNL/Devices/Sequential.h>
9#include <TNL/Devices/Cuda.h>
10#include <TNL/Solvers/Linear/Jacobi.h>
11
12template< typename Device >
13void iterativeLinearSolverExample()
14{
15 /***
16 * Set the following matrix (dots represent zero matrix elements):
17 *
18 * / 2.5 -1 . . . \
19 * | -1 2.5 -1 . . |
20 * | . -1 2.5 -1. . |
21 * | . . -1 2.5 -1 |
22 * \ . . . -1 2.5 /
23 */
26 const int size( 5 );
27 auto matrix_ptr = std::make_shared< MatrixType >();
28 matrix_ptr->setDimensions( size, size );
29 matrix_ptr->setRowCapacities( Vector( { 2, 3, 3, 3, 2 } ) );
30
31 auto f = [=] __cuda_callable__ ( typename MatrixType::RowView& row ) mutable {
32 const int rowIdx = row.getRowIndex();
33 if( rowIdx == 0 )
34 {
35 row.setElement( 0, rowIdx, 2.5 ); // diagonal element
36 row.setElement( 1, rowIdx+1, -1 ); // element above the diagonal
37 }
38 else if( rowIdx == size - 1 )
39 {
40 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
41 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
42 }
43 else
44 {
45 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
46 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
47 row.setElement( 2, rowIdx+1, -1.0 ); // element above the diagonal
48 }
49 };
50
51 /***
52 * Set the matrix elements.
53 */
54 matrix_ptr->forAllRows( f );
55 std::cout << *matrix_ptr << std::endl;
56
57 /***
58 * Set the right-hand side vector.
59 */
60 Vector x( size, 1.0 );
61 Vector b( size );
62 matrix_ptr->vectorProduct( x, b );
63 x = 0.0;
64 std::cout << "Vector b = " << b << std::endl;
65
66 /***
67 * Setup solver of the linear system.
68 */
70 LinearSolver solver;
71 solver.setMatrix( matrix_ptr );
72 solver.setOmega( 0.0005 );
73
74 /***
75 * Setup monitor of the iterative solver.
76 */
77 using IterativeSolverMonitorType = TNL::Solvers::IterativeSolverMonitor< double, int >;
78 IterativeSolverMonitorType monitor;
79 TNL::Solvers::SolverMonitorThread mmonitorThread(monitor);
80 monitor.setRefreshRate(10); // refresh rate in milliseconds
81 monitor.setVerbose(1);
82 monitor.setStage( "Jacobi stage:" );
83 TNL::Timer timer;
84 monitor.setTimer( timer );
85 timer.start();
86 solver.setSolverMonitor(monitor);
87 solver.setConvergenceResidue( 1.0e-6 );
88 solver.solve( b, x );
89 monitor.stopMainLoop();
90 std::cout << "Vector x = " << x << std::endl;
91}
92
93int main( int argc, char* argv[] )
94{
95 std::cout << "Solving linear system on host: " << std::endl;
96 iterativeLinearSolverExample< TNL::Devices::Sequential >();
97
98#ifdef HAVE_CUDA
99 std::cout << "Solving linear system on CUDA device: " << std::endl;
100 iterativeLinearSolverExample< TNL::Devices::Cuda >();
101#endif
102}
Class for real time, CPU time and CPU cycles measuring.
Definition: Timer.h:28
void start()
Starts timer.
Definition: Timer.hpp:49

The only changes happen on lines 83-85 where we create an instance of TNL timer (line 83, TNL::Timer), connect it with the monitor (line 84, TNL::Solvers::SolverMonitor::setTimer) and start the timer (line 85, TNL::Timer::start).

The result looks as follows:

Solving linear system on host:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
ELA: 3.3e-06
ELA: 0.1001 Jacobi stage: ITER: 25043 RES: 0.015417
ELA: 0.2002 Jacobi stage: ITER: 50317 RES: 0.00031779
ELA: 0.3003 Jacobi stage: ITER: 75615 RES: 6.5266e-06
Vector x = [ 0.999999, 0.999999, 0.999998, 0.999999, 0.999999 ]
Solving linear system on CUDA device:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
ELA:1.0327e-
ELA: 0.1001 Jacobi stage: ITER: 5495 RES: 0.31328
ELA: 0.2002 Jacobi stage: ITER: 10995 RES: 0.13349
ELA: 0.30031 Jacobi stage: ITER: 16471 RES: 0.057556
ELA: 0.40042 Jacobi stage: ITER: 21711 RES: 0.02572
ELA: 0.50053 Jacobi stage: ITER: 26999 RES: 0.011423
ELA: 0.60062 Jacobi stage: ITER: 32515 RES: 0.0048957
ELA: 0.70072 Jacobi stage: ITER: 37623 RES: 0.0022339
ELA: 0.80081 Jacobi stage: ITER: 42851 RES: 0.0010007
ELA: 0.90091 Jacobi stage: ITER: 48393 RES: 0.00042706
ELA: 1.001 Jacobi stage: ITER: 53935 RES: 0.00018236
ELA: 1.1011 Jacobi stage: ITER: 59491 RES: 7.7677e-05
ELA: 1.2012 Jacobi stage: ITER: 65023 RES: 3.3189e-05
ELA: 1.3013 Jacobi stage: ITER: 70555 RES: 1.4198e-05
ELA: 1.4014 Jacobi stage: ITER: 76101 RES: 6.0553e-06
ELA: 1.5016 Jacobi stage: ITER: 81635 RES: 2.5888e-06
ELA: 1.6017 Jacobi stage: ITER: 87159 RES: 1.1082e-06
Vector x = [ 0.999999, 0.999999, 0.999998, 0.999999, 0.999999 ]

Setup with preconditioner

Preconditioners of iterative solvers can significantly improve the performance of the solver. In the case of the linear systems, they are used mainly with the Krylov subspace methods. Preconditioners cannot be used with the starionary methods (TNL::Solvers::Linear::Jacobi and TNL::Solvers::Linear::SOR). The following example shows how to setup an iterative solver of linear systems with preconditioning.

1#include <iostream>
2#include <memory>
3#include <TNL/Algorithms/ParallelFor.h>
4#include <TNL/Matrices/SparseMatrix.h>
5#include <TNL/Devices/Host.h>
6#include <TNL/Devices/Cuda.h>
7#include <TNL/Solvers/Linear/TFQMR.h>
8#include <TNL/Solvers/Linear/Preconditioners/Diagonal.h>
9
10template< typename Device >
11void iterativeLinearSolverExample()
12{
13 /***
14 * Set the following matrix (dots represent zero matrix elements):
15 *
16 * / 2.5 -1 . . . \
17 * | -1 2.5 -1 . . |
18 * | . -1 2.5 -1. . |
19 * | . . -1 2.5 -1 |
20 * \ . . . -1 2.5 /
21 */
24 const int size( 5 );
25 auto matrix_ptr = std::make_shared< MatrixType >();
26 matrix_ptr->setDimensions( size, size );
27 matrix_ptr->setRowCapacities( Vector( { 2, 3, 3, 3, 2 } ) );
28
29 auto f = [=] __cuda_callable__ ( typename MatrixType::RowView& row ) mutable {
30 const int rowIdx = row.getRowIndex();
31 if( rowIdx == 0 )
32 {
33 row.setElement( 0, rowIdx, 2.5 ); // diagonal element
34 row.setElement( 1, rowIdx+1, -1 ); // element above the diagonal
35 }
36 else if( rowIdx == size - 1 )
37 {
38 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
39 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
40 }
41 else
42 {
43 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
44 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
45 row.setElement( 2, rowIdx+1, -1.0 ); // element above the diagonal
46 }
47 };
48
49 /***
50 * Set the matrix elements.
51 */
52 matrix_ptr->forAllRows( f );
53 std::cout << *matrix_ptr << std::endl;
54
55 /***
56 * Set the right-hand side vector.
57 */
58 Vector x( size, 1.0 );
59 Vector b( size );
60 matrix_ptr->vectorProduct( x, b );
61 x = 0.0;
62 std::cout << "Vector b = " << b << std::endl;
63
64 /***
65 * Solve the linear system using diagonal (Jacobi) preconditioner.
66 */
69 auto preconditioner_ptr = std::make_shared< Preconditioner >();
70 preconditioner_ptr->update( matrix_ptr );
71 LinearSolver solver;
72 solver.setMatrix( matrix_ptr );
73 solver.setPreconditioner( preconditioner_ptr );
74 solver.setConvergenceResidue( 1.0e-6 );
75 solver.solve( b, x );
76 std::cout << "Vector x = " << x << std::endl;
77}
78
79int main( int argc, char* argv[] )
80{
81 std::cout << "Solving linear system on host: " << std::endl;
82 iterativeLinearSolverExample< TNL::Devices::Sequential >();
83
84#ifdef HAVE_CUDA
85 std::cout << "Solving linear system on CUDA device: " << std::endl;
86 iterativeLinearSolverExample< TNL::Devices::Cuda >();
87#endif
88}
Diagonal (Jacobi) preconditioner for iterative solvers of linear systems.
Definition: Diagonal.h:29

In this example, we solve the same problem as in all other examples in this section. The only differences concerning the preconditioner happen on the lines (68-72). Similar to the matrix of the linear system, the preconditioner is passed to the solver by the means of smart shared pointer (std::shared_ptr). The instance is created on the lines 68 and 69. Next we have to initialize the preconditioner (line 70, TNL::Solvers::Linear::Preconditioners::Preconditioner::update). The method update has to be called everytime the matrix of the linear system changes. This is important for example when solving time dependent PDEs but it does not happen in this example. Finally, we need to connect the solver with the preconditioner (line 73, TNL::Solvers::Linear::LinearSolver).

The result looks as follows:

Solving linear system on host:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
Vector x = [ 1, 1, 1, 1, 1 ]
Solving linear system on CUDA device:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
Vector x = [ 1, 1, 1, 1, 1 ]

Choosing the solver and preconditioner type at runtime

When developing a numerical solver, one often has to search for a combination of various methods and algorithms that fit given requirements the best. To make this easier, TNL offers choosing the type of both linear solver and preconditioner at runtime by means of functions TNL::Solvers::getLinearSolver and TNL::Solvers::getPreconditioner. The following example shows how to use these functions:

1#include <iostream>
2#include <memory>
3#include <TNL/Algorithms/ParallelFor.h>
4#include <TNL/Matrices/SparseMatrix.h>
5#include <TNL/Devices/Host.h>
6#include <TNL/Devices/Cuda.h>
7#include <TNL/Solvers/LinearSolverTypeResolver.h>
8
9template< typename Device >
10void iterativeLinearSolverExample()
11{
12 /***
13 * Set the following matrix (dots represent zero matrix elements):
14 *
15 * / 2.5 -1 . . . \
16 * | -1 2.5 -1 . . |
17 * | . -1 2.5 -1. . |
18 * | . . -1 2.5 -1 |
19 * \ . . . -1 2.5 /
20 */
23 const int size( 5 );
24 auto matrix_ptr = std::make_shared< MatrixType >();
25 matrix_ptr->setDimensions( size, size );
26 matrix_ptr->setRowCapacities( Vector( { 2, 3, 3, 3, 2 } ) );
27
28 auto f = [=] __cuda_callable__ ( typename MatrixType::RowView& row ) mutable {
29 const int rowIdx = row.getRowIndex();
30 if( rowIdx == 0 )
31 {
32 row.setElement( 0, rowIdx, 2.5 ); // diagonal element
33 row.setElement( 1, rowIdx+1, -1 ); // element above the diagonal
34 }
35 else if( rowIdx == size - 1 )
36 {
37 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
38 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
39 }
40 else
41 {
42 row.setElement( 0, rowIdx-1, -1.0 ); // element below the diagonal
43 row.setElement( 1, rowIdx, 2.5 ); // diagonal element
44 row.setElement( 2, rowIdx+1, -1.0 ); // element above the diagonal
45 }
46 };
47
48 /***
49 * Set the matrix elements.
50 */
51 matrix_ptr->forAllRows( f );
52 std::cout << *matrix_ptr << std::endl;
53
54 /***
55 * Set the right-hand side vector.
56 */
57 Vector x( size, 1.0 );
58 Vector b( size );
59 matrix_ptr->vectorProduct( x, b );
60 x = 0.0;
61 std::cout << "Vector b = " << b << std::endl;
62
63 /***
64 * Solve the linear system using diagonal (Jacobi) preconditioner.
65 */
66 auto solver_ptr = TNL::Solvers::getLinearSolver< MatrixType >( "tfqmr" );
67 auto preconditioner_ptr = TNL::Solvers::getPreconditioner< MatrixType >( "diagonal");
68 preconditioner_ptr->update( matrix_ptr );
69 solver_ptr->setMatrix( matrix_ptr );
70 solver_ptr->setPreconditioner( preconditioner_ptr );
71 solver_ptr->setConvergenceResidue( 1.0e-6 );
72 solver_ptr->solve( b, x );
73 std::cout << "Vector x = " << x << std::endl;
74}
75
76int main( int argc, char* argv[] )
77{
78 std::cout << "Solving linear system on host: " << std::endl;
79 iterativeLinearSolverExample< TNL::Devices::Sequential >();
80
81#ifdef HAVE_CUDA
82 std::cout << "Solving linear system on CUDA device: " << std::endl;
83 iterativeLinearSolverExample< TNL::Devices::Cuda >();
84#endif
85}

We still stay with the same problem and the only changes can be seen on lines 66-70. We first create an instance of shared pointer holding the solver (line 66, TNL::Solvers::getLinearSolver) and the same with the preconditioner (line 67, TNL::Solvers::getPreconditioner). The rest of the code is the same as in the previous examples with the only difference that we work with the pointer solver_ptr instead of the direct instance solver of the solver type.

The result looks as follows:

Solving linear system on host:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
Vector x = [ 1, 1, 1, 1, 1 ]
Solving linear system on CUDA device:
Row: 0 -> 0:2.5 1:-1
Row: 1 -> 0:-1 1:2.5 2:-1
Row: 2 -> 1:-1 2:2.5 3:-1
Row: 3 -> 2:-1 3:2.5 4:-1
Row: 4 -> 3:-1 4:2.5
Vector b = [ 1.5, 0.5, 0.5, 0.5, 1.5 ]
Vector x = [ 1, 1, 1, 1, 1 ]