tnl-dev issues
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues

Issue 1: Missing tests (Jakub Klinkovský, 2020-05-10)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/1

- [ ] smart pointers
- [x] `Containers::Algorithms`
- [ ] `Grid`
- [ ] `MeshFunction`, `VectorField`

Issue 2: Python-like string methods (Jakub Klinkovský, 2021-04-13)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/2

The `TNL::String` class should be expanded to include methods inspired by the [Python `str` class](https://docs.python.org/3/library/stdtypes.html#string-methods). Without that, there is no reason to keep using `TNL::String` instead of `std::string`.
Most important methods:
- [x] replace
- [x] strip
- [ ] lstrip
- [ ] rstrip
- [x] startswith
- [x] endswith
- [ ] count
- [ ] find
- [ ] rfind
- [ ] join
- [x] substr (Python does it with the `string[start:end+1]` syntax)
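To pin down the intended semantics of, e.g., `endswith`, here it is expressed in terms of `std::string` (a minimal sketch only, not the `TNL::String` implementation):
```c++
#include <string>

// Python-like endswith, shown on std::string just to fix the semantics.
bool endswith( const std::string& str, const std::string& suffix )
{
   return str.size() >= suffix.size()
       && str.compare( str.size() - suffix.size(), suffix.size(), suffix ) == 0;
}
```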
Issue 4: Update maximum CUDA grid size (Jakub Klinkovský, 2018-02-08)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/4

According to the [documentation](http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities), the maximum CUDA grid size is `(2147483647, 65535, 65535)`. So the [getMaxGridSize](https://jlk.fjfi.cvut.cz/gitlab/mmg/tnl-dev/blob/develop/src/TNL/Devices/Cuda_impl.h#L22) method should be updated to report 2147483647 for the _x_ dimension whenever possible.
For devices with 2.x compute capability, which are unsupported since CUDA 9.0, the maximum grid size was `(65535, 65535, 65535)`.
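For reference, the actual limits can be queried at runtime via the CUDA runtime API; a minimal sketch:
```c++
#include <cuda_runtime.h>
#include <iostream>

int main()
{
   // maxGridSize holds the limits for the x, y and z grid dimensions;
   // for compute capability >= 3.0 this prints 2147483647 65535 65535.
   cudaDeviceProp prop;
   cudaGetDeviceProperties( &prop, 0 );
   std::cout << prop.maxGridSize[ 0 ] << " "
             << prop.maxGridSize[ 1 ] << " "
             << prop.maxGridSize[ 2 ] << std::endl;
   return 0;
}
```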
Issue 8: Sparse matrices todo list (Jakub Klinkovský, 2022-01-21)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/8

- clarify rvalue references and implement a move constructor and move assignment operator
- define and test (with a benchmark) the basic operators (+, -, *scalar, *vector, ...) using the move constructor/assignment
- complete + test + benchmark copying between the individual formats and different Devices - it would be nice if this could be done via the assignment operator (I once wrote a `copySparseMatrix` function - see `Sparse{,_impl}.h`)
- kernels for sparse matrix-matrix multiplication (in ILU the fill-in structure can be determined using powers of a binary matrix - the powers of binary matrix (PBM) strategy - I have a paper on this) (it is also useful in multigrid)
- computation of matrix norms (useful e.g. for testing `||A - LU||` where LU is an incomplete factorization)
- think through `MatrixOperations` (a class for operations independent of the matrix format)
  - the point is to separate the high-level algorithms (= mathematics) from the storage and basic operations
- solvers for triangular matrices - the sequential algorithms are hard-coded in ILU0 (CUDA algorithms are a separate topic, though...)
- dense matrices should be usable in CUDA kernels too
- proxy classes for submatrices (e.g. the diagonal block - useful in distributed computations)
- proxy classes for triangular submatrices (additionally with an implicit 1 on the diagonal) - so that the L and U factors can be stored "in one" matrix and a triangular solver can be applied to them
- transposition - both as a proxy and materialized
- For dense matrices, element access can be simplified using `operator()` instead of `getElementFast`/`setElementFast` (this is common in better libraries). For sparse matrices this could simplify read-only access, but probably not writes (a reference to a non-existent element cannot be returned). Then again, the current implementation of `setElementFast` works from a kernel only when the column index does not need to change, so `operator()` could just as well be used for overwriting an existing value...
- the `getRowLength` method returns a number including the padding zeros of ellpack formats, which is awkward - it is often useful to know the exact real row length. It should either be stored or quickly computable. It is definitely needed for the `getCompressedRowLengths` method...
- implement a `MatrixTypeResolver` for loading matrices from a file (analogous to `MeshTypeResolver`)
- bug: some attributes are not initialized for matrices loaded from a file (e.g. `getMaxRowLength()` returns 0 because the `setCompressedRowLengths` method was not called)
- implement `tnl-matrix-converter` - a tool for conversion between arbitrary matrix formats - it would be useful mainly for export to text files (gnuplot and mtx)
- implement a method for sorting the values in each row by column indices (see https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.sort_indices.html)

Issue 16: Refactor neighbors and periodicNeighbors in DistributedGrid (Tomáš Oberhuber, 2020-07-29)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/16
Assignee: Tomáš Oberhuber

neighbors and periodicNeighbors in DistributedGrid should be implemented using StaticArray.

Issue 17: Update GridTraverser for vertices to support distributed grid overlaps (Tomáš Oberhuber, 2020-07-29)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/17
Assignee: Tomáš Oberhuber

Issue 31: Use std::swap instead of custom methods (Jakub Klinkovský, 2019-08-31)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/31
Assignee: Jakub Klinkovský

`std::swap` works out-of-the-box for objects that are `MoveAssignable` and `MoveConstructible`, see https://stackoverflow.com/q/39675073. If it does not work out of the box, `std::swap` can be overloaded for custom objects. Hence, `std::swap` should be preferred over custom `swap` methods, which mostly just do the trivial thing anyway.
Since [C++20](https://en.cppreference.com/w/cpp/algorithm/swap) `std::swap` will be `constexpr`, so it will be possible to remove even the [TNL::swap](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/blob/a82afc32eeac15982bbb26703de621491eb9e387/src/TNL/Math.h#L159-171) function.
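A minimal sketch (with a hypothetical class, not a TNL type) of why no custom `swap` method is needed once a type is movable:
```c++
#include <utility>
#include <vector>

// Hypothetical container: the implicit move constructor and move assignment
// make it MoveConstructible and MoveAssignable, so std::swap just works.
struct DenseBuffer
{
   std::vector< double > data;
};

int main()
{
   DenseBuffer a{ { 1.0, 2.0 } };
   DenseBuffer b{ { 3.0 } };
   std::swap( a, b );  // three moves, no deep copies and no custom swap method
   return 0;
}
```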
Issue 33: Implement segmented prefix-sum in CUDA (Tomáš Oberhuber, 2019-08-14)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/33

Issue 38: Reimplement StaticVectorExpressions using StaticFor (Tomáš Oberhuber, 2019-08-14)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/38
Assignee: Tomáš Oberhuber

Reimplement StaticVectorExpressions using StaticFor as it is done in StaticArray and StaticVector.

Issue 44: Extended NDArray operations (Jakub Klinkovský, 2021-11-23)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/44
Assignee: Jakub Klinkovský

After !18 the following things remain to be implemented:
## Features
- [ ] implement generic assignment operator
- [x] support any value, device and index types
- [ ] support any permutation
- [ ] support copies to and from non-contiguous memory (e.g. subarrays)
- [x] add support for different allocators (c.f. `Array` implementation)
## Applications
- [ ] storage for `VectorField` in TNL
- [ ] subarrays: writing 1D and 2D slices into VTK
## Operations
- [ ] `reduce_along_axis`, `reduce_along_axes` - generalized multireductions - see also https://bitbucket.org/eigen/eigen/src/default/unsupported/Eigen/CXX11/src/Tensor/README.md?at=default&fileviewer=file-view-default#markdown-header-reduction-operations
- [ ] `apply_along_axis` - apply a function to all 1D slices along given axis (challenge: parallelization of outer or inner function)
- Note that unlike [numpy.apply_along_axis](https://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_along_axis.html), the inner function cannot change the array dimension/shape.
- Note that the similar NumPy function, [apply_over_axes](https://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_over_axes.html), is not applicable to NDArray because the slices along different axes have different type so a single function cannot be applied to them. Also, even in NumPy it is interesting only with the change of dimension/shape.
- [ ] reordering of ND arrays along any axis (currently hardcoded in `tnl-mhfem` only for one specific layout of dofs)
- [ ] other [NumPy array manipulation routines](https://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html) - logical re-shaping and transpose-like operations (i.e. return a view with different sizes or permutation of axes without changing the data)
- [ ] [Eigen geometrical operations on tensors](https://bitbucket.org/eigen/eigen/src/default/unsupported/Eigen/CXX11/src/Tensor/README.md?at=default&fileviewer=file-view-default#markdown-header-geometrical-operations)
## Benchmarks
- [ ] compilation time depending on the number of dimensions, number of ndarrays in code, ...
- [ ] overhead of the indexing calculation for high-dimensional array
- [ ] operations
- [ ] comparison with [RAJA](https://github.com/LLNL/RAJA)
    identity perm, set: 1D bandwidth: 9.4718 GB/s, 6D bandwidth: 8.52481 GB/s (9% difference)
    identity perm, assign: 1D bandwidth: 11.503 GB/s, 6D bandwidth: 11.0063 GB/s (4.5% loss compared to 1D)
    reverse perm, assign: 6D bandwidth: 9.58735 GB/s (13% loss compared to identity 6D)
- [ ] comparison with OpenFOAM - `ScalarField`, `VectorField`, `TensorField`, operations like tensor*vector (locally on mesh cells)

Issue 45: Parallelize segmented prefix-sum with OpenMP (Jakub Klinkovský, 2019-08-14)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/45

The implementation of segmented prefix-sum for `Host` is only sequential.

Issue 51: Wrappers for STL compatibility (Tomáš Oberhuber, 2021-04-13)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/51

Make wrappers for easier porting of STL code to TNL: for example, a wrapper for TNL::Array that has the same methods as the STL vector, and so on.

Issue 53: Bug in analytic functions (Matouš Fencl, 2019-11-07)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/53
Assignee: Tomáš Oberhuber

Analytic functions used as the input MeshFunction in the Hamilton-Jacobi solver with MPI on GPU give bad values.
A function generated by *tnl-init* is passed into *tnl-direct-eikonal-solver*. TNL divides the input MeshFunction into blocks, the number of which depends on the number of processes. The divided MeshFunction that we get in the file *tnlDirectEikonalProblem_impl.h* in the function *Solve()* has invalid values.
Starting script is attached.
[tnl-run-dir-eik-solver](/uploads/f3a60b9e323f9bbecedbf3f23f3f1f9e/tnl-run-dir-eik-solver)

Issue 56: computeCompressedRowLengthsFromMtxFile( ... ) doesn't take the symmetric format into account (Matouš Fencl, 2021-04-13)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/56

The function computeCompressedRowLengthsFromMtxFile( ... ) in matrixReader.h does not take the symmetric format into account. The computed rowLengths are too big for ellpackSymmetric.
Example:
<pre>
/ 1 1 1 1 1 \
| 1 0 0 0 0 |
| 1 0 0 0 0 |
| 1 0 0 0 0 |
\ 1 0 0 0 0 /
</pre>
non-symmetric rowLengths = 5;
symmetric rowLengths = 1;
The function should compute only the under-diagonal rowLengths.
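A sketch of the intended counting (a hypothetical loop over the mtx coordinate entries, not the matrixReader.h code): in a symmetric .mtx file only entries with row >= column are stored, so counting just those gives the under-diagonal (plus diagonal) row lengths:
```c++
#include <cstddef>
#include <vector>

// Hypothetical helper: count only the stored lower-triangle + diagonal
// entries of a symmetric .mtx file, instead of both mirrored halves.
void countSymmetricRowLengths( const std::vector< int >& rows,
                               const std::vector< int >& columns,
                               std::vector< int >& rowLengths )
{
   for( std::size_t i = 0; i < rows.size(); i++ )
      if( rows[ i ] >= columns[ i ] )
         rowLengths[ rows[ i ] ]++;
}
```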
Issue 63: Loading of the matrix circuit5M (Lukáš Matthew Čejka, 2020-06-20)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/63
Assignee: Lukáš Matthew Čejka

The matrix circuit5M.mtx from the Florida Matrix Market takes days to load for some reason. Further investigation is needed.

Issue 68: Remove template parameter `isBinary_` from SparseMatrixRowView (Jakub Klinkovský, 2021-04-13)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/68
Assignee: Tomáš Oberhuber

The method `SparseMatrixRowView::isBinary()` can be implemented by checking whether `RealType` is `bool`, the same way as in `SparseMatrix`.
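A minimal sketch of the proposed check (the class name is a stand-in, not the actual `SparseMatrixRowView`):
```c++
#include <type_traits>

// Hypothetical row view: isBinary() is derived from the Real type, the same
// way as in SparseMatrix, instead of a separate isBinary_ template parameter.
template< typename Real >
struct RowViewSketch
{
   static constexpr bool isBinary()
   {
      return std::is_same< typename std::decay< Real >::type, bool >::value;
   }
};

static_assert( RowViewSketch< bool >::isBinary(), "bool rows are binary" );
static_assert( ! RowViewSketch< double >::isBinary(), "double rows are not binary" );
```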
Issue 74: Implement MeshView and maybe DistributedMeshView (Jakub Klinkovský, 2020-06-24)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/74
Assignee: Jakub Klinkovský

There should be support for sub-configurations, so that light views could be initialized by a full mesh, e.g. a view including only cells, vertices and links from cells to their subvertices. The same can be done with `Mesh`'s copy-constructor.

Issue 75: Fix/improve the implementation of mesh entity orientations (Jakub Klinkovský, 2021-04-11)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/75
Assignee: Jakub Klinkovský

Currently it is untested and inefficient because it is based on storing the whole subvertex permutations for each entity. We need to better understand what information is needed for FVM and improve the implementation.

Issue 76: Grids todo list (Jakub Klinkovský, 2021-09-28)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/76

Continuing #52...
- [x] use `.vti` files for the storage of grids, drop `TNLReader`
- [ ] `getEntityIndex()` should be removed from grid - users should call `entity.getIndex()`
- [ ] `getEntity()` and `getEntitiesCount()` should have `int Dimension` template parameter
- [ ] `isBoundaryEntity()` should be moved from `GridEntity` to `Grid` - it is not only an entity attribute, it is always bound to the particular mesh.
  Generally, entities might be shared between multiple submeshes, so the method does not make sense in the general interface.
  There might also be read-only views for partitions of the mesh (see vienna-grid).
- <s>[ ] the `getMesh()` method should be removed from `GridEntity` - for the same reason as `isBoundaryEntity()`
(it is also an optimization because the size of the entity structure will be smaller)</s>
- [ ] `getCenter()` and `getMeasure()` should be plain functions taking a `Mesh` and a `MeshEntity` as parameters (a sketch follows this list).
  This is because a general MeshEntity stores only the indices of its subvertices; the points have to be accessed via the Mesh class.
  See also [Effective C++ Item 23: Prefer non-member non-friend functions to member functions](https://stackoverflow.com/questions/5989734/effective-c-item-23-prefer-non-member-non-friend-functions-to-member-functions) and the [Open-closed principle](https://en.wikipedia.org/wiki/Open%E2%80%93closed_principle).
Issue 79: StaticComparison bug (Tomáš Jakubec, 2021-12-09)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/79
Assignee: Jakub Klinkovský

Testing the following code:
```c++
#include <iostream>
#include <GTMesh/Debug/Debug.h>
#include <TNL/Containers/StaticVector.h>
#include <TNL/Containers/Vector.h>
using namespace std;
template<int Dim, typename Real>
struct std::numeric_limits<TNL::Containers::StaticVector<Dim, Real>>{
static constexpr bool is_specialized = true;
static TNL::Containers::StaticVector<Dim, Real> max(){
TNL::Containers::StaticVector<Dim, Real> res;
res = std::numeric_limits<Real>::max();
return res;
}
static TNL::Containers::StaticVector<Dim, Real> lowest(){
TNL::Containers::StaticVector<Dim, Real> res;
res = std::numeric_limits<Real>::lowest();
return res;
}
};
using namespace TNL;
int main()
{
TNL::Containers::Vector<TNL::Containers::StaticVector<3,int>, TNL::Devices::Host, size_t> a;
a.setSize(2);
a[0] = TNL::Containers::StaticVector<3,int>{5,-3,6};
a[1] = TNL::Containers::StaticVector<3,int>{8, 1, -5};
TNL::Containers::StaticVector<3,int> a1 = a[0];
TNL::Containers::StaticVector<3,int> a2 = a[1];
DBGVAR(a); // == ..\lookup_problem\main.cpp << 36 >> [[ a ]] ==> [ [ 5, -3, 6 ], [ 8, 1, -5 ] ]
DBGVAR((a1 < a2), (a2 < a1)); // == ..\lookup_problem\main.cpp << 37 >> [[ (a1 < a2) ]] ==> false
// == ..\lookup_problem\main.cpp << 37 >> [[ (a2 < a1) ]] ==> false
DBGVAR(min(a)); // == ..\lookup_problem\main.cpp << 39 >> [[ min(a) ]] ==> [ 5, -3, -5 ]
DBGVAR(min(a1,a2), min(a2,a1));// == ..\lookup_problem\main.cpp << 40 >> [[ min(a1,a2) ]] ==> [ 5, -3, 6 ]
// == ..\lookup_problem\main.cpp << 40 >> [[ min(a2,a1) ]] ==> [ 8, 1, -5 ]
DBGVAR(TNL::min(a1, a2)); // == ..\lookup_problem\main.cpp << 42 >> [[ TNL::min(a1, a2) ]] ==> [ 5, -3, -5 ]
return 0;
}
```
The first problem is the comparison of `a1` and `a2`. Since the two vectors differ, if `(a1 < a2)` is false, then `(a2 < a1)` must be true under any total order. However, in both cases the result is false, which is incorrect. The comparison utilizes `StaticCompare::LT`, defined as:
```c++
__cuda_callable__
static bool LT( const T1& a, const T2& b )
{
TNL_ASSERT_EQ( a.getSize(), b.getSize(), "Sizes of expressions to be compared do not fit." );
for( int i = 0; i < a.getSize(); i++ )
if( ! (a[ i ] < b[ i ]) )
return false;
return true;
}
```
This function does not implement a suitable (e.g. lexicographical) comparison.
Secondly, there is a difference between the calls of min. Both `min(a)` and `TNL::min(a1, a2)` utilize `StaticBinaryExpressionTemplate< ET1, ET2, Min >`, which results in returning a vector with the minimum in each element separately (which is awesome). However, for `min(a1, a2)` the `min` from the STL is called (`std::min` is prioritized over `TNL::Containers::Expressions::min`) and it employs `StaticCompare::LT` through `operator<`. This problem is solved by removing `using namespace std` (which is partially my mistake, but worth mentioning). The incorrect implementation of `StaticCompare::LT` makes the result depend on the order of the arguments of `min`.
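For comparison, a lexicographical ordering with the same loop structure would be total, so exactly one of `a < b`, `b < a`, `a == b` holds (a sketch, not a patch for TNL):
```c++
// Sketch of a lexicographical less-than for static vectors; illustration
// only, not the TNL implementation.
template< typename T1, typename T2 >
bool lexicographicalLT( const T1& a, const T2& b )
{
   for( int i = 0; i < a.getSize(); i++ ) {
      if( a[ i ] < b[ i ] )
         return true;
      if( b[ i ] < a[ i ] )
         return false;
   }
   return false;  // the vectors are equal
}
```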
The macro `DBGVAR` is defined in the [GTMesh library](https://mmg-gitlab.fjfi.cvut.cz/gitlab/jakubec/GTMesh).

Issue 80: Silence of unused variable warning (Tomáš Jakubec, 2020-07-12)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/80
Assignee: Tomáš Jakubec

It would be nice to silence the warnings caused by unused variables. There are several ways to do it:
1. do not name the unused argument (not preferable),
2. use `[[maybe_unused]]` (since C++17) in the function definition,
3. cast the unused variable to void: `(void) var`. This approach is used in Qt.
For example consider the following function:
```c++
static std::string name( CudaStatusType error_code )
{
#ifdef HAVE_CUDA
return cudaGetErrorName( error_code );
#else
(void) error_code;
throw CudaSupportMissing();
#endif
}
```
The approach in [Qt](https://doc.qt.io/qt-5/qtglobal.html#Q_UNUSED) discussed [here](https://stackoverflow.com/questions/19576884/does-q-unused-have-any-side-effects) is the following:
```c++
#define UNUSED(var) (void) var
```
Then, the warning can be silenced by `UNUSED(error_code);`.
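For comparison, option 2 applied to the same function needs no macro at all (assuming C++17 is available):
```c++
// Option 2: [[maybe_unused]] on the parameter silences the warning in both
// branches without a macro (C++17).
static std::string name( [[maybe_unused]] CudaStatusType error_code )
{
#ifdef HAVE_CUDA
   return cudaGetErrorName( error_code );
#else
   throw CudaSupportMissing();
#endif
}
```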
Issue 81: Fix reorderEntities for DistributedMesh (Jakub Klinkovský, 2020-07-29)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/81
Assignee: Jakub Klinkovský

The current naïve implementation cannot work - `DistributedMeshSynchronizer` assumes that global indices of local entities are sorted, so we should update the global indices too and exchange the new global indices for ghost entities.

Issue 82: VectorOfStaticVectorsTestCuda: unspecified launch failure in the CUDA reduction kernel with StaticVector values (Jakub Klinkovský, 2020-10-31)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/82
Assignee: Jakub Klinkovský

There are compiler warnings like
```
[292/351] Building NVCC (Device) object src/UnitTests/Containers/CMakeFiles/VectorOfStaticVectorsTestCuda.dir/VectorOfStaticVectorsTestCuda_generated_VectorOfStaticVectorsTestCuda.cu.o
/tmp/rexe_klinkovsky/tnl/src/TNL/Algorithms/CudaReductionKernel.h(60): warning #3126-D: calling a __host__ function from a __host__ __device__ function is not allowed
detected during:
instantiation of "auto TNL::Algorithms::CudaReductionFunctorWrapper(Reduction &&, Arg1 &&, Arg2 &&) [with Reduction=const std::plus<void> &, Arg1=TNL::Containers::Expressions::RemoveET<std::decay_t<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::Addition, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>>> &, Arg2=TNL::Containers::Expressions::RemoveET<std::decay_t<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::Addition, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>>>]"
(95): here
instantiation of "void TNL::Algorithms::CudaReductionKernel<blockSize,Result,DataFetcher,Reduction,Index>(Result, DataFetcher, Reduction, Index, Index, Result *) [with blockSize=256, Result=TNL::Containers::Expressions::RemoveET<std::decay_t<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::Addition, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>>>, DataFetcher=lambda [](int)->TNL::Containers::Expressions::RemoveET<std::decay_t<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::Addition, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>>>, Reduction=std::plus<void>, Index=int]"
(512): here
instantiation of "int TNL::Algorithms::CudaReductionKernelLauncher<Index, Result>::launch(Index, Index, const Reduction &, DataFetcher &, const Result &, Result *) [with Index=int, Result=TNL::Containers::Expressions::RemoveET<std::decay_t<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::Addition, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>>>, DataFetcher=lambda [](int)->TNL::Containers::Expressions::RemoveET<std::decay_t<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::Addition, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>>>, Reduction=std::plus<void>]"
(378): here
instantiation of "Result TNL::Algorithms::CudaReductionKernelLauncher<Index, Result>::finish(const Reduction &, const Result &) [with Index=int, Result=TNL::Containers::Expressions::RemoveET<std::decay_t<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::Addition, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>>>, Reduction=std::plus<void>]"
/tmp/rexe_klinkovsky/tnl/src/TNL/Algorithms/Reduction.hpp(368): here
instantiation of "Result TNL::Algorithms::Reduction<TNL::Devices::Cuda>::reduce(Index, Index, const ReductionOperation &, DataFetcher &, const Result &) [with Index=int, Result=TNL::Containers::Expressions::RemoveET<std::decay_t<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>, TNL::Containers::Expressions::Addition, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>>>, ReductionOperation=std::plus<void>, DataFetcher=lambda [](IndexType)->TNL::Containers::Expressions::StaticBinaryExpressionTemplate<TNL::Containers::StaticVector<3, int>, TNL::Containers::StaticVector<3, short>, TNL::Containers::Expressions::Multiplication, TNL::Containers::Expressions::VectorExpressionVariable, TNL::Containers::Expressions::VectorExpressionVariable>]"
/tmp/rexe_klinkovsky/tnl/src/TNL/Containers/Expressions/VerticalOperations.h(122): here
[ 7 instantiation contexts not shown ]
implicit generation of "testing::internal::TestFactoryImpl<TestClass>::~TestFactoryImpl() [with TestClass=binary_tests::VectorBinaryOperationsTest_scalarProduct_Test<binary_tests::Pair<TNL::Containers::Vector<TNL::Containers::StaticVector<3, int>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, int>>>, TNL::Containers::Vector<TNL::Containers::StaticVector<3, short>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, short>>>>>]"
/tmp/rexe_klinkovsky/tnl/Release/googletest-src/googletest/include/gtest/internal/gtest-internal.h(742): here
instantiation of class "testing::internal::TestFactoryImpl<TestClass> [with TestClass=binary_tests::VectorBinaryOperationsTest_scalarProduct_Test<binary_tests::Pair<TNL::Containers::Vector<TNL::Containers::StaticVector<3, int>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, int>>>, TNL::Containers::Vector<TNL::Containers::StaticVector<3, short>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, short>>>>>]"
/tmp/rexe_klinkovsky/tnl/Release/googletest-src/googletest/include/gtest/internal/gtest-internal.h(742): here
implicit generation of "testing::internal::TestFactoryImpl<TestClass>::TestFactoryImpl() [with TestClass=binary_tests::VectorBinaryOperationsTest_scalarProduct_Test<binary_tests::Pair<TNL::Containers::Vector<TNL::Containers::StaticVector<3, int>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, int>>>, TNL::Containers::Vector<TNL::Containers::StaticVector<3, short>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, short>>>>>]"
/tmp/rexe_klinkovsky/tnl/Release/googletest-src/googletest/include/gtest/internal/gtest-internal.h(742): here
instantiation of class "testing::internal::TestFactoryImpl<TestClass> [with TestClass=binary_tests::VectorBinaryOperationsTest_scalarProduct_Test<binary_tests::Pair<TNL::Containers::Vector<TNL::Containers::StaticVector<3, int>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, int>>>, TNL::Containers::Vector<TNL::Containers::StaticVector<3, short>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, short>>>>>]"
/tmp/rexe_klinkovsky/tnl/Release/googletest-src/googletest/include/gtest/internal/gtest-internal.h(742): here
instantiation of "__nv_bool testing::internal::TypeParameterizedTest<Fixture, TestSel, Types>::Register(const char *, const testing::internal::CodeLocation &, const char *, const char *, int, const std::vector<std::string, std::allocator<std::string>> &) [with Fixture=binary_tests::VectorBinaryOperationsTest, TestSel=testing::internal::TemplateSel<binary_tests::VectorBinaryOperationsTest_scalarProduct_Test>, Types=binary_tests::gtest_type_params_VectorBinaryOperationsTest_]"
/tmp/rexe_klinkovsky/tnl/src/UnitTests/Containers/VectorBinaryOperationsTest.h(618): here
```
And when the test is executed, it fails with
```
1/95 Test #32: VectorOfStaticVectorsTestCuda .................Child aborted***Exception: 4.78 sec
[==========] Running 130 tests from 8 test suites.
[----------] Global test environment set-up.
[----------] 19 tests from VectorBinaryOperationsTest/0, where TypeParam = binary_tests::Pair<TNL::Containers::Vector<TNL::Containers::StaticVector<3, int>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, int> > >, TNL::Containers::Vector<TNL::Containers::StaticVector<3, short>, TNL::Devices::Cuda, int, TNL::Allocators::Cuda<TNL::Containers::StaticVector<3, short> > > >
[ RUN ] VectorBinaryOperationsTest/0.EQ
[ OK ] VectorBinaryOperationsTest/0.EQ (4029 ms)
[ RUN ] VectorBinaryOperationsTest/0.NE
[ OK ] VectorBinaryOperationsTest/0.NE (0 ms)
[ RUN ] VectorBinaryOperationsTest/0.LT
[ OK ] VectorBinaryOperationsTest/0.LT (0 ms)
[ RUN ] VectorBinaryOperationsTest/0.GT
[ OK ] VectorBinaryOperationsTest/0.GT (1 ms)
[ RUN ] VectorBinaryOperationsTest/0.LE
[ OK ] VectorBinaryOperationsTest/0.LE (0 ms)
[ RUN ] VectorBinaryOperationsTest/0.GE
[ OK ] VectorBinaryOperationsTest/0.GE (1 ms)
[ RUN ] VectorBinaryOperationsTest/0.addition
[ OK ] VectorBinaryOperationsTest/0.addition (0 ms)
[ RUN ] VectorBinaryOperationsTest/0.subtraction
[ OK ] VectorBinaryOperationsTest/0.subtraction (0 ms)
[ RUN ] VectorBinaryOperationsTest/0.multiplication
[ OK ] VectorBinaryOperationsTest/0.multiplication (1 ms)
[ RUN ] VectorBinaryOperationsTest/0.division
[ OK ] VectorBinaryOperationsTest/0.division (0 ms)
[ RUN ] VectorBinaryOperationsTest/0.assignment
[ OK ] VectorBinaryOperationsTest/0.assignment (0 ms)
[ RUN ] VectorBinaryOperationsTest/0.add_assignment
[ OK ] VectorBinaryOperationsTest/0.add_assignment (1 ms)
[ RUN ] VectorBinaryOperationsTest/0.subtract_assignment
[ OK ] VectorBinaryOperationsTest/0.subtract_assignment (0 ms)
[ RUN ] VectorBinaryOperationsTest/0.multiply_assignment
[ OK ] VectorBinaryOperationsTest/0.multiply_assignment (1 ms)
[ RUN ] VectorBinaryOperationsTest/0.divide_assignment
[ OK ] VectorBinaryOperationsTest/0.divide_assignment (0 ms)
[ RUN ] VectorBinaryOperationsTest/0.scalarProduct
terminate called after throwing an instance of 'TNL::Exceptions::CudaRuntimeError'
what(): CUDA ERROR 719 (cudaErrorLaunchFailure): unspecified launch failure.
Source: line 81 in /tmp/rexe_klinkovsky/tnl/src/TNL/Allocators/Cuda.h: unspecified launch failure
```

Issue 84: Assertions for correct definitions of lambda functions (Tomáš Oberhuber, 2021-02-04)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/84
Assignee: Tomáš Oberhuber

Based on the following concept
https://en.cppreference.com/w/cpp/types/is_invocable (C++17)
https://www.boost.org/doc/libs/develop/libs/callable_traits/doc/html/callable_traits/reference.html#callable_traits.reference.ref_is_invocable (Boost, C++11)
methods accepting lambda functions should check via `static_assert` that the lambdas have the correct parameter signature.
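A minimal sketch of such an assertion (the function name and the expected signature are illustrative only):
```c++
#include <type_traits>

// Hypothetical example: reject lambdas with a wrong signature at compile
// time instead of producing a long template error.
template< typename Index, typename Function >
void forAllElements( Index begin, Index end, Function f )
{
   static_assert( std::is_invocable< Function, Index >::value,
                  "the lambda function must be callable as f( Index i )" );
   for( Index i = begin; i < end; i++ )
      f( i );
}
```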
Issue 85: Assignment of symmetric and general sparse matrices does not work (Tomáš Oberhuber, 2021-02-04)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/85
Assignee: Tomáš Oberhuber

Only the lower part and the diagonal of the symmetric matrix are assigned to the general one.

Issue 88: SegmentsPrinter::print does not work on GPU (Tomáš Oberhuber, 2021-06-06)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/88
Assignee: Tomáš Oberhuber

The lambda function `fetch` in `SegmentsPrinter::print` causes a CUDA kernel crash when it is called (`SegmentsPrinting.h:76`). It is probably not handled properly by the `SegmentsPrinter`. The same lambda function works well in the function `printSegments` (`SegmentsPrinting.h:121`). This can be tested, for example, using `Examples/Algorithms/Segments/SegmentsExample_General.cu` by replacing (line 39)
```
printSegments( segments, fetch, std::cout )
```
with
```
std::cout << segments.print( fetch ) << std::endl;
```

Issue 89: Sorting TODO (Tomáš Oberhuber, 2021-07-19)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/89

TODO for Quicksort:
- [ ] refactoring the code of implementation
- [ ] there is a fixed type `int` instead of `Index`
- [ ] I have split the code between `Quicksort` and `Quicksorter` because `Quicksorter` cannot have a static method `sort` - `Value` and `Device` must be template parameters of the class, not of the method `sort`. If we can refactor `Quicksorter` so that it is easier to use, we can merge `Quicksort` and `Quicksorter` together.
- [ ] `TNL/Algorithms/Sorting/details` contains several files with helper functions and kernels for Quicksort like `quicksort_kernel.h`, `quicksort_1Block.h`, `reduction.h` etc. We should check them for duplicities and rename them or merge them.
- [ ] implementation for the `Sequential` and `Host` devices (or even `Rocm`)
TODO for bitonic sort:
- [ ] refactoring
- [ ] port for Rocm
General TODO:
- [ ] we do not have an in-place sort that could be used with lambda functions for `Host`
- [ ] refactoring and extension of unit tests
- [ ] fetch the CUDA samples from the git repo - https://github.com/NVIDIA/cuda-samples - instead of relying on a local installation

Issue 90: Refactor contains and containsOnlyValue (Jakub Klinkovský, 2021-08-02)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/90

- `contains` should be changed to `find` (c.f. [std::find](https://en.cppreference.com/w/cpp/algorithm/find))
- `containsOnlyValue` should be generalized to have an interface similar to `std::all_of`, `std::any_of`, `std::none_of`: https://en.cppreference.com/w/cpp/algorithm/all_any_none_of (see the sketch after this list)
- maybe we should add `operator==` for comparing an array/vector with a scalar/ValueType
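A sketch of the proposed all_of-style generalization on a raw range (a hypothetical free function, just to illustrate the interface):
```c++
#include <algorithm>

// Hypothetical replacement for containsOnlyValue: an all_of-style predicate
// interface over a contiguous range of elements.
template< typename Value, typename Predicate >
bool allElements( const Value* data, int size, Predicate pred )
{
   return std::all_of( data, data + size, pred );
}

// possible usage (getData()/getSize() as in TNL arrays):
// allElements( v.getData(), v.getSize(),
//              [] ( const double& x ) { return x == 0.0; } );
```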
Issue 91: Segments: "compute" parameter is not always checked (Jakub Klinkovský, 2021-09-28)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/91
Assignee: Tomáš Oberhuber

- BiEllpack: `compute` seems to be checked correctly
- ChunkedEllpack: `compute` seems to be checked correctly
- Ellpack: `compute` is checked only in the general cases ([1](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/EllpackView.hpp#L410), [2](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/EllpackView.hpp#L427)), but not in the CUDA specializations ([3](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/EllpackView.hpp#L44-46), [4](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/EllpackView.hpp#L79-81))
- SlicedEllpack: `compute` is not checked at all: [1](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/SlicedEllpackView.hpp#L347-349), [2](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/SlicedEllpackView.hpp#L364-366)
- CSR:
- Adaptive: `compute` is not checked: [1](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRAdaptiveKernelView.hpp#L83-125)
- Hybrid: `compute` is checked in the [multivector kernel](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRHybridKernel.hpp#L112), but not in the [hybrid kernel](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRHybridKernel.hpp#L52-55)
- Light: `compute` is checked in the [multivector kernel](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRLightKernel.hpp#L316-320), but not in the [other kernels](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRLightKernel.hpp#L50-252)
- Scalar: `compute` seems to be checked correctly
- Vector: `compute` is not checked: [1](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRVectorKernel.hpp#L57-61)
Obviously we don't have any tests for this feature. But do we have some benchmark which proves that this optimization helps in some cases?

Issue 92: Sparse matrices with 64-bit indices for addressing values, but 32-bit storage for column indices (Jakub Klinkovský, 2021-09-28)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/92
Assignee: Tomáš Oberhuber

E.g. PE has a matrix 471526400 x 471526400 with 3099301688 non-zeros. A 64-bit type is necessary to address the non-zero values and column indices in the global arrays, but we can store the column indices themselves as 32-bit. This would save about 25% of space (8 + 4 = 12 bytes per non-zero for `double` + `int` instead of 8 + 8 = 16 bytes for `double` + `long int`).
See also https://github.com/pyamg/pyamg/issues/277

Issue 94: Refactor SpMV kernels using CudaBlockReduceShfl::warpReduce (Jakub Klinkovský, 2021-09-28)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/94
Assignee: Tomáš Oberhuber

Various SpMV kernels have "inlined" code for parallel reduction across a warp, e.g. [EllpackCudaReductionKernelFull](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/EllpackView.hpp#L48-53). They should call [CudaBlockReduceShfl::warpReduce](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/develop/src/TNL/Algorithms/detail/CudaReductionKernel.h#L187-203) instead.

Issue 95: Fix getSerializationType() methods in segments (Jakub Klinkovský, 2021-10-28)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/95

The following discussions from !105 should be addressed:
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1902):
> FIXME
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1903):
> FIXME
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1904):
> FIXME
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1905):
> FIXME
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1906):
> FIXME
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1907):
> FIXME
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1908):
> FIXME
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1909):
> FIXME
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1910):
> FIXME
- [ ] @klinkovsky started a [discussion](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/merge_requests/105#note_1911):
> FIXME

Issue 97: Allocators and smart pointers (Jakub Klinkovský, 2021-11-25)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/97
Assignee: Jakub Klinkovský

- add `Allocator` to smart pointers

(moved from #26)

Issue 98: Structured logger for solvers (Jakub Klinkovský, 2021-11-25)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/98
Assignee: Jakub Klinkovský

- the `TNL::Logger` class should be structured, i.e. like the inverse of the `ParameterContainer`
- then it will be possible to make serialization classes (JSON, yaml or pretty-format) or let it pass to Python
- idea: the `writeProlog` method just serializes `ParameterContainer` into `Logger`, it should be possible to do that automatically
- some logging library like [spdlog](https://github.com/gabime/spdlog) or [Easylogging++](https://github.com/muflihun/easyloggingpp) could be used
(moved from #26)

Issue 99: Configurable SolverMonitor (Jakub Klinkovský, 2021-12-09)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/99
Assignee: Jakub Klinkovský

The `SolverMonitor` class should be more configurable for different types of applications. The interface could be similar to the [BenchmarkResult](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/develop/src/TNL/Benchmarks/Benchmarks.h#L28-60) class used for configuring benchmarks.

Issue 100: JSON log transform script not working (Lukáš Matthew Čejka, 2022-02-24)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/100

Benchmark logs produced by the [run-tnl-benchmark-spmv](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/develop/src/Benchmarks/scripts/run-tnl-benchmark-spmv) script fail to be parsed by the JSON parser script [tnl-spmv-benchmark-make-tables-json.py](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/develop/src/Benchmarks/scripts/tnl-spmv-benchmark-make-tables-json.py) with the following error:
```
Parsing input file....
Traceback (most recent call last):
File "tnl-spmv-benchmark-make-tables-json.py", line 956, in <module>
d = json.load(f)
File "/usr/lib/python3.8/json/__init__.py", line 293, in load
return loads(fp.read(),
File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.8/json/decoder.py", line 340, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 444)
```
**How to reproduce:**
1. If you don't have any matrices set up in the script directory, then you can run the [get-matrices](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/develop/src/Benchmarks/scripts/get-matrices) script to download some into the folder "scripts/mtx_matrices".
2. Run spmv benchmarks using the [run-tnl-benchmark-spmv](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/develop/src/Benchmarks/scripts/run-tnl-benchmark-spmv) script.
3. Convert the benchmark JSON logs using the [tnl-spmv-benchmark-make-tables-json.py](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/develop/src/Benchmarks/scripts/tnl-spmv-benchmark-make-tables-json.py) script.
**Expected behaviour:**
- The python script will convert the log file containing JSON results of benchmarks to an html file.
**Actual behaviour:**
- The Python script fails since the logs are not valid JSON as a whole; rather, every line is a valid JSON document on its own (source: @klinkovsky).
**Notes:**
- Loading the entire JSON from the logs won't work; each line will have to be parsed separately.
For example:
```
import json

data = []
for line in open("sparse-matrix-benchmark.log").readlines():
    data.append(json.loads(line))
```
- When working with tables in Python, @klinkovsky recommends using the Pandas library. Specifically, to load logs into a Pandas dataframe, the following function can be used: [link to file](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/develop/src/Python/BenchmarkLogs.py#L40-54).
- Example log file: [sparse-matrix-benchmark.log](/uploads/e1eb86b965fc9a7692d8c3bdd4cf7402/sparse-matrix-benchmark.log).

Issue 101: Update tnl-decompose-mesh for polyhedral meshes (Jakub Klinkovský, 2022-07-03)
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/101
Assignee: Jakub Klinkovský

We also need to collect the faces on each subdomain and add them to the mesh builder.