# tnl-dev issues
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues

## Issue #4: Update maximum CUDA grid size
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/4 (2018-02-08, Jakub Klinkovský)

According to the [documentation](http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities), the maximum CUDA grid size is `(2147483647, 65535, 65535)`. So the [getMaxGridSize](https://jlk.fjfi.cvut.cz/gitlab/mmg/tnl-dev/blob/develop/src/TNL/Devices/Cuda_impl.h#L22) method should be updated to report 2147483647 for the _x_ dimension whenever possible.
For devices with compute capability 2.x, which are unsupported since CUDA 9.0, the maximum grid size was `(65535, 65535, 65535)`.

## Issue #45: Parallelize segmented prefix-sum with OpenMP
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/45 (2019-08-14, Jakub Klinkovský)

The implementation of segmented prefix-sum for `Host` is only sequential.

## Issue #75: Fix/improve the implementation of mesh entity orientations
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/75 (2021-04-11, Jakub Klinkovský)

Currently it is untested and inefficient, because it is based on storing the whole subvertex permutation for each entity. We need to better understand what information is needed for FVM and improve the implementation.

Assignee: Jakub Klinkovský

## Issue #91: Segments: "compute" parameter is not always checked
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/91 (2021-09-28, Jakub Klinkovský)

- BiEllpack: `compute` seems to be checked correctly
- ChunkedEllpack: `compute` seems to be checked correctly
- Ellpack: `compute` is checked only in the general cases ([1](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/EllpackView.hpp#L410), [2](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/EllpackView.hpp#L427)), but not in the CUDA specializations ([3](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/EllpackView.hpp#L44-46), [4](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/EllpackView.hpp#L79-81))
- SlicedEllpack: `compute` is not checked at all: [1](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/SlicedEllpackView.hpp#L347-349), [2](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/SlicedEllpackView.hpp#L364-366)
- CSR:
- Adaptive: `compute` is not checked: [1](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRAdaptiveKernelView.hpp#L83-125)
- Hybrid: `compute` is checked in the [multivector kernel](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRHybridKernel.hpp#L112), but not in the [hybrid kernel](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRHybridKernel.hpp#L52-55)
- Light: `compute` is checked in the [multivector kernel](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRLightKernel.hpp#L316-320), but not in the [other kernels](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRLightKernel.hpp#L50-252)
- Scalar: `compute` seems to be checked correctly
- Vector: `compute` is not checked: [1](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/blob/TO/matrices-adaptive-csr/src/TNL/Algorithms/Segments/Kernels/CSRVectorKernel.hpp#L57-61)
Obviously we don't have any tests for this feature. But do we have a benchmark which proves that this optimization helps in some cases?

Assignee: Tomáš Oberhuber