diff --git a/Documentation/Pages/comparison-with-other-libraries.md b/Documentation/Pages/comparison-with-other-libraries.md
index 755110a7204fb52c5e8bcac85ea6a62b902c7eba..d47e3ede57c4d49f7f03d8edfffb1c1be1957571 100644
--- a/Documentation/Pages/comparison-with-other-libraries.md
+++ b/Documentation/Pages/comparison-with-other-libraries.md
@@ -1,4 +1,4 @@
-\page comparison_with_other_libraries  Comparison with other libraries
+# Comparison with other libraries
 
 ## Memory space and execution model
 
diff --git a/Documentation/Pages/core-concepts.md b/Documentation/Pages/core-concepts.md
index de99eff9e60a053daf065c2558024a502b2e6700..92a0d4f7190dc55a2c77e087f20aed7ca368a882 100644
--- a/Documentation/Pages/core-concepts.md
+++ b/Documentation/Pages/core-concepts.md
@@ -1,4 +1,4 @@
-\page core_concepts  Core concepts
+# Core concepts
 
 TNL is based on the following core concepts:
 
diff --git a/Documentation/Pages/main-page.md b/Documentation/Pages/main-page.md
index 7f77793947805e8f3cfcd73bc6713719e0a1ba26..cf595614c9411995ac5668d52ed4751dc7cf6bac 100644
--- a/Documentation/Pages/main-page.md
+++ b/Documentation/Pages/main-page.md
@@ -23,7 +23,7 @@ several modules:
   either the host CPU or an accelerator (GPU), and for each there are many ways
   to manage parallel execution. The usage of memory spaces is abstracted with
   \ref TNL::Allocators "allocators" and the execution model is represented by
-  \ref TNL::Devices "devices". See the \ref core_concepts "Core concepts" page
+  \ref TNL::Devices "devices". See the [Core concepts](core-concepts.md) page
   for details.
 - \ref TNL::Containers "Containers".
   TNL provides generic containers such as array, multidimensional array or array
@@ -49,7 +49,7 @@ several modules:
   exports from several file formats such as DICOM, PNG, and JPEG are provided
   using external libraries (see below).
 
-See also \ref comparison_with_other_libraries "Comparison with other libraries".
+See also [Comparison with other libraries](comparison-with-other-libraries.md).
 
 TNL also provides several optional components:
 <a name="optional-components"></a>
diff --git a/Documentation/Tutorials/Arrays/tutorial_Arrays.md b/Documentation/Tutorials/Arrays/tutorial_Arrays.md
index a8405e5f06a0dc25d59d9d57e12ff269270b48c1..8f6a8aeb2b742705533176145e2d10a0ed56eb39 100644
--- a/Documentation/Tutorials/Arrays/tutorial_Arrays.md
+++ b/Documentation/Tutorials/Arrays/tutorial_Arrays.md
@@ -1,4 +1,4 @@
-\page tutorial_Arrays  Arrays tutorial
+# Arrays tutorial
 
 [TOC]
 
@@ -94,7 +94,7 @@ Output:
 
 ### Arrays and flexible reduction
 
-Arrays also offer simpler way to do the flexible parallel reduction. See the section about [the flexible parallel reduction](tutorial_ReductionAndScan.html#flexible_parallel_reduction) to understand how it works. Flexible reduction for arrays just simplifies access to the array elements. See the following example:
+Arrays also offer a simpler way to perform flexible parallel reduction. See the section about [flexible parallel reduction](../ReductionAndScan/tutorial_ReductionAndScan.md) to understand how it works. Flexible reduction for arrays merely simplifies access to the array elements. See the following example:
 
 \include ArrayExample_reduceElements.cpp
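+
+The gist of the example can also be illustrated with a plain sequential sketch. This is not the TNL interface; it only shows the concept of splitting the operation into a `fetch` and a `reduce` lambda, which TNL then evaluates in parallel on the chosen device:
+
+```cpp
+#include <vector>
+#include <iostream>
+
+int main()
+{
+   // Hypothetical data: an array of values whose sum we want.
+   std::vector< double > a{ 1, 2, 3, 4, 5 };
+
+   // "fetch" maps an element index to the value entering the reduction,
+   // "reduce" combines two partial results.
+   auto fetch = [ & ]( std::size_t i ) { return a[ i ]; };
+   auto reduce = []( double x, double y ) { return x + y; };
+
+   double result = 0.0;  // identity element of the reduction
+   for( std::size_t i = 0; i < a.size(); i++ )
+      result = reduce( result, fetch( i ) );
+
+   std::cout << "sum = " << result << std::endl;  // prints: sum = 15
+}
+```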
 
diff --git a/Documentation/Tutorials/ForLoops/tutorial_ForLoops.md b/Documentation/Tutorials/ForLoops/tutorial_ForLoops.md
index 06d8ac3183d1aa98d8c0122b660d1af0c391790b..f03a62d4434297991547f8c3702c9552485f3813 100644
--- a/Documentation/Tutorials/ForLoops/tutorial_ForLoops.md
+++ b/Documentation/Tutorials/ForLoops/tutorial_ForLoops.md
@@ -1,4 +1,4 @@
-\page tutorial_ForLoops For loops
+# For loops tutorial
 
 [TOC]
 
diff --git a/Documentation/Tutorials/GeneralConcepts/tutorial_GeneralConcepts.md b/Documentation/Tutorials/GeneralConcepts/tutorial_GeneralConcepts.md
index d6295c7e8216bb3b53f470197d13e39e0f82accf..71219de188fac9b325c2952120a82fcf72c55c9d 100644
--- a/Documentation/Tutorials/GeneralConcepts/tutorial_GeneralConcepts.md
+++ b/Documentation/Tutorials/GeneralConcepts/tutorial_GeneralConcepts.md
@@ -1,4 +1,4 @@
-\page tutorial_GeneralConcepts General concepts
+# General concepts
 
 [TOC]
 
@@ -46,7 +46,7 @@ In this example, we assume that all arrays `v1`, `v2` and `sum` were properly al
 
 \includelineno snippet_algorithms_and_lambda_functions_reduction.cpp
 
-We will not explain the parallel reduction in TNL at this moment (see the section about [flexible parallel reduction](tutorial_ReductionAndScan.html#flexible_parallel_reduction) ), we hope that the idea is more or less clear from the code snippet. If `Device` equals to \ref TNL::Device::Host , the scalar product is evaluated sequentially or in parallel by several OpenMP threads on CPU, if `Device` equals \ref TNL::Algorithms::Cuda, the [parallel reduction](https://developer.download.nvidia.com/assets/cuda/files/reduction.pdf) fine tuned with the lambda functions is performed. Fortunately, there is no performance drop. On the contrary, since it is easy to generate CUDA kernels for particular situations, we may get more efficient code. Consider computing a scalar product of sum of vectors like this
+We will not explain the parallel reduction in TNL at this point (see the section about [flexible parallel reduction](../ReductionAndScan/tutorial_ReductionAndScan.md)); we hope that the idea is more or less clear from the code snippet. If `Device` equals \ref TNL::Devices::Host, the scalar product is evaluated sequentially or in parallel by several OpenMP threads on the CPU; if `Device` equals \ref TNL::Devices::Cuda, the [parallel reduction](https://developer.download.nvidia.com/assets/cuda/files/reduction.pdf) fine-tuned with the lambda functions is performed. Fortunately, there is no performance drop. On the contrary, since it is easy to generate CUDA kernels for particular situations, we may get more efficient code. Consider computing the scalar product of sums of vectors like this
 
 \f[
 s = (u_1 + u_2, v_1 + v_2).
diff --git a/Documentation/Tutorials/Matrices/tutorial_Matrices.md b/Documentation/Tutorials/Matrices/tutorial_Matrices.md
index efcf7a4bfe9dcbabc90983e7c064877ad1c1ef46..c010ccff05f370b6f96316e222466c5ca74867c9 100644
--- a/Documentation/Tutorials/Matrices/tutorial_Matrices.md
+++ b/Documentation/Tutorials/Matrices/tutorial_Matrices.md
@@ -1,4 +1,4 @@
-\page tutorial_Matrices  Matrices tutorial
+# Matrices tutorial
 
 [TOC]
 
@@ -139,7 +139,7 @@ In such a form, it is more efficient to refer the nonzero matrix elements in giv
 
 ## Matrix view
 
-Matrix views are small reference objects which help accessing the matrix in GPU kernels or lambda functions being executed on GPUs. We describe this in details in section about [Shared pointers and views](tutorial_GeneralConcepts.html#shared-pointers-and-views). The problem lies in fact that we cannot pass references to GPU kernels and we do not want to pass there deep copies of matrices. Matrix view is some kind of reference to a matrix. A copy of matrix view is always shallow and so it behaves like a reference.  The following example shows how to obtain the matrix view by means of method `getView` and pass it to a lambda function:
+Matrix views are small reference objects which help to access the matrix in GPU kernels or in lambda functions executed on GPUs. We describe this in detail in the section about [Shared pointers and views](../GeneralConcepts/tutorial_GeneralConcepts.md). The problem lies in the fact that we cannot pass references to GPU kernels and we do not want to pass deep copies of matrices there. A matrix view is a kind of reference to a matrix: a copy of a matrix view is always shallow, so it behaves like a reference. The following example shows how to obtain the matrix view by means of the method `getView` and pass it to a lambda function:
 
 \includelineno SparseMatrixViewExample_getRow.cpp
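+
+The essential property used here is that copying a view never copies the underlying data. The following self-contained sketch (generic C++, not the actual TNL matrix classes) shows why a shallow copy behaves like a reference and can therefore be safely captured by value in a lambda function:
+
+```cpp
+#include <iostream>
+#include <vector>
+
+// A hypothetical, minimal "view": it only stores a pointer and a size,
+// so copying it is cheap and both copies refer to the same data.
+struct View
+{
+   double* data;
+   int size;
+   double& operator[]( int i ) { return data[ i ]; }
+};
+
+int main()
+{
+   std::vector< double > storage( 5, 0.0 );
+   View view{ storage.data(), 5 };
+
+   // The lambda captures the view BY VALUE - a shallow copy.
+   auto f = [ view ]( int i ) mutable { view[ i ] = i; };
+   for( int i = 0; i < view.size; i++ )
+      f( i );
+
+   std::cout << storage[ 4 ] << std::endl;  // prints 4: the original data was modified
+}
+```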
 
@@ -380,7 +380,7 @@ More efficient way of the matrix initialization on GPU consists of calling the m
 
 \includelineno DenseMatrixViewExample_setElement.cpp
 
-Here we get the matrix view (\ref TNL::Matrices::DenseMatrixView) (line 10) to make the matrix accessible in lambda function even on GPU (see [Shared pointers and views](tutorial_GeneralConcepts.html#shared-pointers-and-views) ). We first call the `setElement` method from CPU to set the `i`-th diagonal element to `i` (lines 11-12). Next we iterate over the matrix rows with `ParallelFor2D` (\ref TNL::Algorithms::ParallelFor2D) (line 20) and for each row we call the lambda function `f`. This is done on the same device where the matrix is allocated and so it we get optimal performance even for matrices on GPU. In the lambda function we add one to each matrix element (line 18). The result looks as follows:
+Here we get the matrix view (\ref TNL::Matrices::DenseMatrixView) (line 10) to make the matrix accessible in a lambda function even on GPU (see [Shared pointers and views](../GeneralConcepts/tutorial_GeneralConcepts.md)). We first call the `setElement` method from the CPU to set the `i`-th diagonal element to `i` (lines 11-12). Next we iterate over the matrix rows with `ParallelFor2D` (\ref TNL::Algorithms::ParallelFor2D) (line 20) and for each row we call the lambda function `f`. This is done on the same device where the matrix is allocated, so we get optimal performance even for matrices on GPU. In the lambda function we add one to each matrix element (line 18). The result looks as follows:
 
 \include DenseMatrixExample_setElement.out
 
@@ -1323,7 +1323,7 @@ We see that in i-th matrix row we have to compute the sum \f$\sum_{j=1}^n a_{ij}
 
 #### Lambda function fetch
 
-This lambda function has the same purpose as the lambda function `fetch` in flexible parallel reduction for arrays and vectors (see [Flexible Parallel Reduction](tutorial_ReductionAndScan.html#flexible_parallel_reduction)). It is supposed to be declared as follows:
+This lambda function has the same purpose as the lambda function `fetch` in the flexible parallel reduction for arrays and vectors (see [Flexible Parallel Reduction](../ReductionAndScan/tutorial_ReductionAndScan.md)). It should be declared as follows:
 
 \includelineno snippet_rows_reduction_fetch_declaration.cpp
 
diff --git a/Documentation/Tutorials/Meshes/tutorial_Meshes.md b/Documentation/Tutorials/Meshes/tutorial_Meshes.md
index 21e982668ec7c6d5d5bc37f9b9c7302334167891..586f4a13f24f3a6594edbc2d7f0e94114815048c 100644
--- a/Documentation/Tutorials/Meshes/tutorial_Meshes.md
+++ b/Documentation/Tutorials/Meshes/tutorial_Meshes.md
@@ -1,4 +1,4 @@
-\page tutorial_Meshes  Unstructured meshes tutorial
+# Unstructured meshes tutorial
 
 [TOC]
 
@@ -71,7 +71,7 @@ Given a mesh instance denoted as `mesh`, it can be used like this:
 
 \snippet MeshIterationExample.cpp getEntitiesCount
 
-Note that this member function and all other member functions presented below are marked as [\_\_cuda\_callable\_\_](tutorial_GeneralConcepts.html), so they can be called from usual host functions as well as CUDA kernels.
+Note that this member function and all other member functions presented below are marked as [\_\_cuda\_callable\_\_](../GeneralConcepts/tutorial_GeneralConcepts.md), so they can be called both from ordinary host functions and from CUDA kernels.
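+
+Conceptually, such a function is compiled for both the host and the device. A macro similar in spirit (sketched here with a made-up name, not the actual TNL definition) might look like this:
+
+```cpp
+// Illustration only: a macro in the spirit of TNL's __cuda_callable__.
+// With a CUDA compiler it expands to __host__ __device__, so the function
+// can be called from host code as well as from CUDA kernels; otherwise it
+// expands to nothing and the code compiles as ordinary C++.
+#ifdef __CUDACC__
+   #define my_cuda_callable __host__ __device__
+#else
+   #define my_cuda_callable
+#endif
+
+struct Cell
+{
+   int index;
+   my_cuda_callable int getIndex() const { return index; }
+};
+```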
 
 The entity of given dimension and index can be accessed via a member function template called [getEntity](@ref TNL::Meshes::Mesh::getEntity).
 Again, the entity dimension is specified as a template argument and the index is specified as a method argument.
@@ -124,7 +124,7 @@ For example, the iteration over cells on a mesh allocated on the host can be don
 \snippet MeshIterationExample.cpp Parallel iteration host
 
 The parallel iteration is more complicated for meshes allocated on a GPU, since the lambda expression needs to capture a pointer to the copy of the mesh, which is allocated on the right device.
-This can be achieved with a [smart pointer](tutorial_Pointers.html) as follows:
+This can be achieved with a [smart pointer](../Pointers/tutorial_Pointers.md) as follows:
 
 \snippet ParallelIterationCuda.h Parallel iteration CUDA
 
diff --git a/Documentation/Tutorials/Pointers/tutorial_Pointers.md b/Documentation/Tutorials/Pointers/tutorial_Pointers.md
index 310130c88ce40be2caad97346ac9d2f52dd7fe12..78851ebdbe31781825c2cfc38667633ce66d3b8b 100644
--- a/Documentation/Tutorials/Pointers/tutorial_Pointers.md
+++ b/Documentation/Tutorials/Pointers/tutorial_Pointers.md
@@ -1,4 +1,4 @@
-\page tutorial_Pointers  Cross-device pointers tutorial
+# Cross-device smart pointers tutorial
 
 [TOC]
 
diff --git a/Documentation/Tutorials/ReductionAndScan/tutorial_ReductionAndScan.md b/Documentation/Tutorials/ReductionAndScan/tutorial_ReductionAndScan.md
index 0c55abd2ef0f8650ecedfac2394c6d15cef42d61..3eb1d1b36e4d5e2b95ad7f7748567653d5dd175f 100644
--- a/Documentation/Tutorials/ReductionAndScan/tutorial_ReductionAndScan.md
+++ b/Documentation/Tutorials/ReductionAndScan/tutorial_ReductionAndScan.md
@@ -1,4 +1,4 @@
-\page tutorial_ReductionAndScan Flexible (parallel) reduction and prefix-sum tutorial
+# Flexible (parallel) reduction and prefix-sum tutorial
 
 [TOC]
 
diff --git a/Documentation/Tutorials/Segments/tutorial_Segments.md b/Documentation/Tutorials/Segments/tutorial_Segments.md
index 619f576824bdef4f7d644e95e4bf77694d81c786..5c7043018b1e9ab8a99db18d05bd6bbd8e29efcc 100644
--- a/Documentation/Tutorials/Segments/tutorial_Segments.md
+++ b/Documentation/Tutorials/Segments/tutorial_Segments.md
@@ -1,4 +1,4 @@
-\page tutorial_Segments  Segments tutorial
+# Segments tutorial
 
 [TOC]
 
diff --git a/Documentation/Tutorials/Solvers/Linear/tutorial_Linear_solvers.md b/Documentation/Tutorials/Solvers/Linear/tutorial_Linear_solvers.md
index 0c5fcb49e9ff3e20ecc69ab68451437bb5a93c12..79f61eef11f08315034ff8718342e1e0200a443d 100644
--- a/Documentation/Tutorials/Solvers/Linear/tutorial_Linear_solvers.md
+++ b/Documentation/Tutorials/Solvers/Linear/tutorial_Linear_solvers.md
@@ -1,8 +1,8 @@
-\page tutorial_Linear_solvers  Linear solvers tutorial
+# Linear solvers tutorial
 
 [TOC]
 
-# Introduction
+## Introduction
 
 Solvers of linear systems are among the most important algorithms in scientific computations. TNL offers the following iterative methods:
 
@@ -28,9 +28,9 @@ The iterative solvers (not the stationary solvers like \ref TNL::Solvers::Linear
    1. [ILU(0)](https://en.wikipedia.org/wiki/Incomplete_LU_factorization) \ref TNL::Solvers::Linear::Preconditioners::ILU0
    2. [ILUT (ILU with thresholding)](https://www-users.cse.umn.edu/~saad/PDF/umsi-92-38.pdf) \ref TNL::Solvers::Linear::Preconditioners::ILUT
 
-# Iterative solvers of linear systems
+## Iterative solvers of linear systems
 
-## Basic setup
+### Basic setup
 
 All iterative solvers for linear systems can be found in the namespace \ref TNL::Solvers::Linear. The following example shows the use of the iterative solvers:
 
@@ -58,7 +58,7 @@ The result looks as follows:
 
 \include IterativeLinearSolverExample.out
 
-## Setup with a solver monitor
+### Setup with a solver monitor
 
 Solution of large linear systems may take a lot of time. In such situations, it is useful to be able to monitor the convergence of the solver, or the solver status in general. For this purpose, TNL offers solver monitors. The solver monitor prints (or somehow visualizes) the number of iterations, the residue of the current solution approximation or some other metrics. Such information might be printed after each iteration or after every ten iterations. The problem with this approach is that one iteration of the solver may take only a few milliseconds, but also several minutes. In the former case, the monitor creates an overwhelming amount of output which may even slow down the solver. In the latter case, the user waits a long time for an update of the solver status. The monitor in TNL therefore runs in a separate thread and refreshes the solver status at preset time intervals, as the generic sketch below illustrates. The use of the TNL iterative solver monitor itself is demonstrated in the following example.
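+
+A minimal sketch of this pattern in plain C++ follows. It is not the TNL monitor, only the general idea: a monitor thread wakes up at a fixed period and reports a status value that the computation keeps updating.
+
+```cpp
+#include <atomic>
+#include <chrono>
+#include <iostream>
+#include <thread>
+
+int main()
+{
+   std::atomic< double > residue{ 1.0 };
+   std::atomic< bool > done{ false };
+
+   // Monitor thread: refreshes the status at preset time intervals,
+   // independently of how long one "iteration" of the solver takes.
+   std::thread monitor( [ & ] {
+      while( ! done ) {
+         std::cout << "current residue: " << residue << std::endl;
+         std::this_thread::sleep_for( std::chrono::milliseconds( 500 ) );
+      }
+   } );
+
+   // "Solver" loop: the main computation, updating the shared status.
+   for( int iteration = 0; iteration < 20; iteration++ ) {
+      std::this_thread::sleep_for( std::chrono::milliseconds( 100 ) );  // fake work
+      residue = residue * 0.5;
+   }
+   done = true;
+   monitor.join();
+}
+```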
 
@@ -80,7 +80,7 @@ The result looks as follows:
 
 \include IterativeLinearSolverWithTimerExample.out
 
-## Setup with preconditioner
+### Setup with preconditioner
 
 Preconditioners of iterative solvers can significantly improve the performance of the solver. In the case of linear systems, they are used mainly with the Krylov subspace methods. Preconditioners cannot be used with the stationary methods (\ref TNL::Solvers::Linear::Jacobi and \ref TNL::Solvers::Linear::SOR). The following example shows how to set up an iterative solver of linear systems with preconditioning.
 
@@ -92,7 +92,7 @@ The result looks as follows:
 
 \include IterativeLinearSolverWithPreconditionerExample.out
 
-## Choosing the solver and preconditioner type at runtime
+### Choosing the solver and preconditioner type at runtime
 
 When developing a numerical solver, one often has to search for a combination of various methods and algorithms that best fits the given requirements. To make this easier, TNL allows choosing the type of both the linear solver and the preconditioner at runtime by means of the functions \ref TNL::Solvers::getLinearSolver and \ref TNL::Solvers::getPreconditioner. The following example shows how to use these functions:
 
@@ -102,4 +102,4 @@ We still stay with the same problem and the only changes can be seen on lines 66
 
 The result looks as follows:
 
-\include IterativeLinearSolverWithRuntimeTypesExample.out
\ No newline at end of file
+\include IterativeLinearSolverWithRuntimeTypesExample.out
diff --git a/Documentation/Tutorials/Sorting/tutorial_Sorting.md b/Documentation/Tutorials/Sorting/tutorial_Sorting.md
index 4832077cb6b2541925423e6d58ff9345e242751a..c0e6cdc2465786544c794e02f9ed0bfe913e7279 100644
--- a/Documentation/Tutorials/Sorting/tutorial_Sorting.md
+++ b/Documentation/Tutorials/Sorting/tutorial_Sorting.md
@@ -1,4 +1,4 @@
-\page tutorial_Sorting Sorting tutorial
+# Sorting tutorial
 
 [TOC]
 
diff --git a/Documentation/Tutorials/Vectors/tutorial_Vectors.md b/Documentation/Tutorials/Vectors/tutorial_Vectors.md
index 5cd3b60f1ad27a335edbc29288fb2988dccb9ef5..5448522b6688a4c543c09f8d93a41744d7f861a8 100644
--- a/Documentation/Tutorials/Vectors/tutorial_Vectors.md
+++ b/Documentation/Tutorials/Vectors/tutorial_Vectors.md
@@ -1,4 +1,4 @@
-\page tutorial_Vectors  Vectors tutorial
+# Vectors tutorial
 
 [TOC]
 
@@ -14,7 +14,7 @@ This tutorial introduces vectors in TNL. `Vector`, in addition to `Array`, offer
 * `Device` is the device where the vector is allocated. Currently it can be either `Devices::Host` for CPU or `Devices::Cuda` for GPU supporting CUDA.
 * `Index` is the type to be used for indexing the vector elements.
 
-`Vector`, unlike `Array`, requires that the `Real` type is numeric or a type for which basic algebraic operations are defined. What kind of algebraic operations is required depends on what vector operations the user will call. `Vector` is derived from `Array` so it inherits all its methods. In the same way the `Array` has its counterpart `ArraView`, `Vector` has `VectorView` which is derived from `ArrayView`. We refer to to [Arrays tutorial](../../Arrays/html/index.html) for more details.
+`Vector`, unlike `Array`, requires that the `Real` type is numeric or a type for which basic algebraic operations are defined. Which algebraic operations are required depends on which vector operations the user calls. `Vector` is derived from `Array`, so it inherits all its methods. In the same way as `Array` has its counterpart `ArrayView`, `Vector` has `VectorView`, which is derived from `ArrayView`. We refer to the [Arrays tutorial](../Arrays/tutorial_Arrays.md) for more details.
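+
+As an illustration of what "basic algebraic operations" means here, the following standalone sketch (plain C++, independent of the TNL containers, with a hypothetical `Interval` type) shows a non-numeric `Real` that still works with generic element-wise addition because it provides `operator+`:
+
+```cpp
+#include <vector>
+#include <iostream>
+
+// A hypothetical user-defined "Real" type: it is not numeric, but it
+// defines the one algebraic operation that vector addition requires.
+struct Interval
+{
+   double lo, hi;
+   Interval operator+( const Interval& other ) const
+   {
+      return { lo + other.lo, hi + other.hi };
+   }
+};
+
+// Generic element-wise addition: compiles for any Real with operator+.
+template< typename Real >
+std::vector< Real > add( const std::vector< Real >& u, const std::vector< Real >& v )
+{
+   std::vector< Real > result( u.size() );
+   for( std::size_t i = 0; i < u.size(); i++ )
+      result[ i ] = u[ i ] + v[ i ];
+   return result;
+}
+
+int main()
+{
+   std::vector< Interval > u{ { 0, 1 }, { 1, 2 } };
+   std::vector< Interval > v{ { 2, 3 }, { 3, 4 } };
+   auto w = add( u, v );
+   std::cout << w[ 1 ].lo << " " << w[ 1 ].hi << std::endl;  // prints: 4 6
+}
+```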
 
 ### Horizontal operations
 
diff --git a/Documentation/Tutorials/index.md b/Documentation/Tutorials/index.md
index 900641f1bf42d81181f31c72128265c3545075b2..49795597c2f9c16d776a39c17495d86697560633 100644
--- a/Documentation/Tutorials/index.md
+++ b/Documentation/Tutorials/index.md
@@ -1,15 +1,13 @@
-\page Tutorials
+\page Tutorials Tutorials
 
-## Tutorials
-
-1. [General concepts](tutorial_GeneralConcepts.html)
-2. [Arrays](tutorial_Arrays.html)
-3. [Vectors](tutorial_Vectors.html)
-4. [Flexible parallel reduction and scan](tutorial_ReductionAndScan.html)
-5. [For loops](tutorial_ForLoops.html)
-6. [Sorting](tutorial_Sorting.html)
-7. [Cross-device pointers](tutorial_Pointers.html)
-8. [Matrices](tutorial_Matrices.html)
-9. [Linear solvers](tutorial_Linear_solvers.html)
-10. [Segments aka sparse formats](tutorial_Segments.html)
-11. [Unstructured meshes](tutorial_Meshes.html)
+1. [General concepts](./GeneralConcepts/tutorial_GeneralConcepts.md)
+2. [Arrays](./Arrays/tutorial_Arrays.md)
+3. [Vectors](./Vectors/tutorial_Vectors.md)
+4. [Flexible parallel reduction and scan](./ReductionAndScan/tutorial_ReductionAndScan.md)
+5. [For loops](./ForLoops/tutorial_ForLoops.md)
+6. [Sorting](./Sorting/tutorial_Sorting.md)
+7. [Cross-device pointers](./Pointers/tutorial_Pointers.md)
+8. [Matrices](./Matrices/tutorial_Matrices.md)
+9. [Linear solvers](./Solvers/Linear/tutorial_Linear_solvers.md)
+10. [Segments aka sparse formats](./Segments/tutorial_Segments.md)
+11. [Unstructured meshes](./Meshes/tutorial_Meshes.md)
diff --git a/README.md b/README.md
index 756d80e65ba17a39ab7dc7ef85019f285ede2c1c..6d518319d2164fb52850e582f4b0ae43207d676e 100644
--- a/README.md
+++ b/README.md
@@ -47,11 +47,11 @@ several modules:
   using external libraries (see below).
 
 See also [Comparison with other libraries](
-https://mmg-gitlab.fjfi.cvut.cz/doc/tnl/comparison_with_other_libraries.html).
+https://mmg-gitlab.fjfi.cvut.cz/doc/tnl/md_Pages_comparison_with_other_libraries.html).
 
 [allocators]: https://mmg-gitlab.fjfi.cvut.cz/doc/tnl/namespaceTNL_1_1Allocators.html
 [devices]: https://mmg-gitlab.fjfi.cvut.cz/doc/tnl/namespaceTNL_1_1Devices.html
-[core concepts]: https://mmg-gitlab.fjfi.cvut.cz/doc/tnl/core_concepts.html
+[core concepts]: https://mmg-gitlab.fjfi.cvut.cz/doc/tnl/md_Pages_core_concepts.html
 [containers]: https://mmg-gitlab.fjfi.cvut.cz/doc/tnl/namespaceTNL_1_1Containers.html
 [vectors]: https://mmg-gitlab.fjfi.cvut.cz/doc/tnl/classTNL_1_1Containers_1_1Vector.html
 [matrices]: https://mmg-gitlab.fjfi.cvut.cz/doc/tnl/namespaceTNL_1_1Matrices.html