tnl-dev issues
https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/21
tnl-benchmark-blas produces wrong log (Tomáš Oberhuber, 2018-12-14)
The log file generated by tnl-benchmark-blas has a wrong format in the case of SpMV tests.

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/22
Update String documentation (Tomáš Oberhuber, 2019-03-04)
I have added the flag skipEmpty to String::split; it needs to be mentioned in the documentation.
Assignee: Nina Džugasová

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/23
Probably useless parameter in benchmarkArrayOperations (Tomáš Oberhuber, 2018-12-20)
The parameter `loops` in Benchmarks/array-operations.h:26
```C++
benchmarkArrayOperations( Benchmark & benchmark,
const int & loops,
const long & size )
```
seems to be useless.
Assignee: Jakub Klinkovský

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/24
Add scalar multiplicator parameter to vectorProduct in other sparse matrix formats (Jakub Klinkovský, 2020-03-01)
In https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/commit/30b25f639627af6c9b5055a901a5fc599e2b46b2, the `multiplicator` parameter was added to the `vectorProduct` method in the Ellpack format to make it possible to compute `outVector = multiplicator * matrix * inVector` in one step. The parameter should be added to the other sparse matrix formats as well.
Assignee: Lukáš Matthew Čejka

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/25
TYPED_TEST_CASE deprecated according to Google Test (Lukáš Matthew Čejka, 2019-03-28)
Upon compilation of the [unit tests](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/tree/develop/src/UnitTests/Matrices) that use TYPED_TEST_CASE, Google Test emits a series of warnings stating that TYPED_TEST_CASE is deprecated and should be replaced with TYPED_TEST_SUITE.
Should TYPED_TEST_CASE be replaced with TYPED_TEST_SUITE?
Assignee: Lukáš Matthew Čejka

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/26
Code revision (Tomáš Oberhuber, 2021-12-08)
TODO list for the code revision:
* renaming _impl.h files to .hpp
* change boolean return type to exception throwing
- [x] fix exception catching by-reference: #29
- [x] use `ASSERT_NO_THROW( map.save( "multimap-test.tnl" ) );` instead of `ASSERT_TRUE( map.save( "multimap-test.tnl" ) );` etc. in tests (see https://github.com/google/googletest/blob/master/googletest/docs/advanced.md#exception-assertions)
* change boolean flag style parameters into enum class
* use operators << and >> instead of File.read and File.write where it makes sense
* [x] add exception `NotImplementedError` and use it instead of code like this:
```cpp
std::cerr << "Type conversion during saving is not implemented for MIC." << std::endl;
abort();
```
Or this:
```cpp
TNL_ASSERT( false, std::cerr << "TODO: implement" );
```
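A minimal sketch of what such an exception type could look like (a hypothetical shape; the actual class in TNL may differ), so that the error is thrown instead of aborting the program:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical sketch of a NotImplementedError exception; deriving from
// std::runtime_error gives it a message accessible via what().
struct NotImplementedError : std::runtime_error
{
   using std::runtime_error::runtime_error;
};

// Usage instead of printing to std::cerr and calling abort():
void saveWithConversion()
{
   throw NotImplementedError( "Type conversion during saving is not implemented for MIC." );
}
```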
* [x] Documentation: indicate which functions/methods throw which exceptions
* Remove useless types from the public interface of classes:
- [x] `ThisType` - useless for the outside of the class, and the implementation of methods can simply use the class name
* [x] switch to C++14
----
- [x] String
- [x] Timer
- [x] Object
- [x] change bool save and load to void
- [x] File
- [x] Update examples, add data conversion.
- [x] use TransferBufferSize from Devices::Cuda
- [x] Remove `File::Mode` enum, use [std::ios_base::openmode](https://en.cppreference.com/w/cpp/io/ios_base/openmode) directly
- [x] Remove specific exceptions (`ArrayWrongSize`, `MeshFunctionDataMismatch`, `NotTNLFile`, `ObjectTypeDetectionFailure`, `ObjectTypeMismatch`) - all `save`/`load` methods should throw only `FileSerializationError`/`FileDeserializationError`
- [x] FileName
- [x] Array
- [x] Add method for setting array elements using lambda function
  - [x] Copy constructor has to make a deep copy - DOES NOT WORK with MultiMap yet
- [x] Add copy constructors from `std::list`, `std::vector` and `std::initializer_list`
- [x] Replace bind methods with ArrayView - after refactoring MeshFunction
- [x] Avoid binding in `Array( const Array&, const Index begin, const Index size );`
- [x] Remove `boundLoad` (used only in `Array` and derived objects, loading via `ArrayView` can be used instead: `array.getView().load( file );`)
- [x] Use `operator<<` and `operator>>` instead of `save` and `load` methods
- [x] Delete `operator bool ()`
- [x] ArrayView
- [x] what about `__cuda_callable__` ArrayView assignment?
- [x] Use `operator<<` and `operator>>` instead of `save` and `load` methods
- [x] StaticVector
- [x] Replace all for loops with static loops, i.e. templated for
  - [x] u * v should not be dot product but element-wise multiplication
- [x] use (u,v) for dot product
- [x] Vector
- [x] `Vector` should have the same serialization type as `Array` so that arrays can be loaded into vectors and vice versa
  - [x] Delete methods for vector operations which are used in linear solvers - the replacement in the solvers must be tested carefully
- [x] Implement DistributedVectorExpressions and DistributedVectorViewExpressions
  - [x] In some places the method getView() is used because DistributedVectorExpressions are not implemented yet (mainly linear solvers and BLAS benchmark) - all occurrences can be deleted later
- [x] Add constructor from expression template
- [x] Remove MultiVector and MultiArray - after refactoring Cameo - summer 2019
- [x] Parallel reduction
- [x] Use auto in lambdas to avoid volatileReduction - it does not work with CUDA 10.0
- [x] Rewrite multi-reduction with lambdas
- [x] ConfigDescription
- [x] ParameterContainer
- allow any types (currently only `int`, `double`, `bool`, `String`)
- allow easy instantiation without the command-line parser (e.g. to pass a dict from python)
- exceptions when accessing unknown parameters
- [x] Logger
- moved to https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/98
- [x] Pointers
- moved to https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/97
- [x] Matrices
- [x] fix getSerializationType the same way as in Array
- [x] rename the method setCompressedRowLengths to setRowCapacities
- [x] implement sparse matrices using Segments
- [x] implement CSR matrix and Ellpack matrix using the Segments, use existing unit tests for debugging
- [ ] implement set/addRow using VectorView and lambda functions
- [x] implement "Lambda matrix" - the element values are given by a lambda function
- [x] implement constructor from initializer list to DenseMatrix
- [x] implement method for setting elements from initializer list to sparse matrix
- [x] implement constructor and method for setting elements from std::map similar to initializer list to sparse matrix
- [ ] update dense matrix multiplication with the new dense matrix implementation
- [ ] update dense matrix transposition with the new dense matrix implementation
  - [x] move methods for matrix coloring outside the matrices
- [ ] finish DenseMatrixView::getRowVectorProduct
  - [x] implement operator == for matrices
- [x] Add constructor of matrix views from vector views
- [x] Allocators
  - moved to !33

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/27
Fix getType for all matrix formats to print in a consistent form (Lukáš Matthew Čejka, 2019-03-05)
`getType()` is **inconsistent** across different matrix formats.
For example:
* **CSR** `getType()` gives: `Matrices::CSR< double, Devices::Host >`
* **Ellpack** `getType()` gives: `Matrices::Ellpack< double, Devices::Host, int >`
* **Sliced Ellpack** `getType()` gives: `Matrices::SlicedEllpack< double, Devices::Host >`
Is the correct form
`Matrices::FORMAT_NAME< RealType, Devices::DEVICE_TYPE, IndexType >` ?
Assignee: Lukáš Matthew Čejka

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/28
Saving data from MeshFunction CPU MPI doesn't work for bigger dimension (Matouš Fencl, 2019-09-23)
MeshFunctionPointer 3D on CPU with MpiIO doesn't save the right data for a big mesh (256^3).
Everything works on CPU for meshes 16^3, 32^3, ..., 128^3. The calculation is fine and the saved data are the same as what the program (hamilton-jacobi branch) calculates. Only the biggest mesh I use gives me trouble. The values of the 256^3 calculation are fine when I see them in the console, but they are not saved into the file. MPI divides the original mesh by default in the "z" direction and I use 4 processes for the calculation (2 cores on a laptop).
Both calculation and saving of values work on GPU with and without MPI!
Assignee: Tomáš Oberhuber

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/29
Exceptions should be caught by reference (Jakub Klinkovský, 2019-04-19)
All exceptions should be caught by reference, not by value. Bug in https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/commit/5fb2f09ff3baf7ea433a586e779a5abd731a9320:
```
In file included from ../src/Tools/tnl-diff.cpp:11:
../src/Tools/tnl-diff.h: In instantiation of ‘bool processFiles(const TNL::Config::ParameterContainer&) [with Mesh = TNL::Meshes::Grid<1, float, TNL::Devices::{anonymous}::Host, int>]’:
../src/Tools/tnl-diff.h:653:75: required from ‘bool resolveGridIndexType(const std::vector<TNL::String>&, const TNL::Config::ParameterContainer&) [with int Dim = 1; Real = float]’
../src/Tools/tnl-diff.h:665:48: required from ‘bool resolveGridRealType(const std::vector<TNL::String>&, const TNL::Config::ParameterContainer&) [with int Dim = 1]’
../src/Tools/tnl-diff.cpp:70:69: required from here
../src/Tools/tnl-diff.h:630:4: warning: catching polymorphic type ‘class std::ios_base::failure’ by value [-Wcatch-value=]
catch( std::ios_base::failure exception )
^~~~~
../src/Tools/tnl-diff.h: In instantiation of ‘bool processFiles(const TNL::Config::ParameterContainer&) [with Mesh = TNL::Meshes::Grid<1, float, TNL::Devices::{anonymous}::Host, long int>]’:
../src/Tools/tnl-diff.h:655:80: required from ‘bool resolveGridIndexType(const std::vector<TNL::String>&, const TNL::Config::ParameterContainer&) [with int Dim = 1; Real = float]’
../src/Tools/tnl-diff.h:665:48: required from ‘bool resolveGridRealType(const std::vector<TNL::String>&, const TNL::Config::ParameterContainer&) [with int Dim = 1]’
../src/Tools/tnl-diff.cpp:70:69: required from here
../src/Tools/tnl-diff.h:630:4: warning: catching polymorphic type ‘class std::ios_base::failure’ by value [-Wcatch-value=]
../src/Tools/tnl-diff.h: In instantiation of ‘bool processFiles(const TNL::Config::ParameterContainer&) [with Mesh = TNL::Meshes::Grid<1, double, TNL::Devices::{anonymous}::Host, int>]’:
../src/Tools/tnl-diff.h:653:75: required from ‘bool resolveGridIndexType(const std::vector<TNL::String>&, const TNL::Config::ParameterContainer&) [with int Dim = 1; Real = double]’
../src/Tools/tnl-diff.h:667:49: required from ‘bool resolveGridRealType(const std::vector<TNL::String>&, const TNL::Config::ParameterContainer&) [with int Dim = 1]’
../src/Tools/tnl-diff.cpp:70:69: required from here
```
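A minimal sketch of the fixed pattern (the surrounding tnl-diff code is only hinted at here): catching by const reference preserves the dynamic exception type and silences the `-Wcatch-value` warning.

```cpp
#include <fstream>
#include <iostream>

// Sketch: catch polymorphic exceptions by (const) reference, never by value,
// so the object is not sliced down to its base class.
bool processFile( const char* fileName )
{
   try {
      std::ifstream file;
      file.exceptions( std::ios_base::failbit );  // throw on failure
      file.open( fileName );
   }
   catch( const std::ios_base::failure& exception ) {  // by reference
      std::cerr << "I/O error: " << exception.what() << std::endl;
      return false;
   }
   return true;
}
```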
Other places should be checked and revised as well.
Assignee: Tomáš Oberhuber

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/30
CUDA version installation (Matouš Fencl, 2019-03-29)
TNL installation failure.
CUDA: 10.1
NVIDIA driver: 418.43
cmake: 3.14.0
gcc: 7.3.0
error:
```
/home/maty/Documents/tnl/src/TNL/Atomic.h: In instantiation of ‘T TNL::Atomic<T, TNL::Devices::Cuda>::load() const [with T = int]’:
/home/maty/Documents/tnl/src/TNL/Atomic.h:160:12: required from ‘TNL::Atomic<T, TNL::Devices::Cuda>::operator T() const [with T = int]’
/home/maty/Documents/tnl/src/TNL/Matrices/DistributedSpMV.h:117:25: required from ‘void TNL::Matrices::DistributedSpMV<Matrix, Communicator>::updateCommunicationPattern(const MatrixType&, TNL::Matrices$
/tmp/tmpxft_000024e2_00000000-5_tnl-benchmark-distributed-spmv.cudafe1.stub.c:34:531: required from here
/home/maty/Documents/tnl/src/TNL/Atomic.h:154:17: error: passing ‘const TNL::Atomic<int, TNL::Devices::Cuda>’ as ‘this’ argument discards qualifiers [-fpermissive]
return ((Atomic*)this)->fetch_add( 0 );
~~~~~~~~~^~~
/home/maty/Documents/tnl/src/TNL/Atomic.h:202:3: note: in call to ‘T TNL::Atomic<T, TNL::Devices::Cuda>::fetch_add(T) [with T = int]’
T fetch_add( T arg )
^~~~~~~~~
CMake Error at tnl-benchmark-distributed-spmv-cuda_generated_tnl-benchmark-distributed-spmv.cu.o.Debug.cmake:279 (message):
Error generating file
/home/maty/Documents/tnl/Debug/src/Benchmarks/DistSpMV/CMakeFiles/tnl-benchmark-distributed-spmv-cuda.dir//./tnl-benchmark-distributed-spmv-cuda_generated_tnl-benchmark-distributed-spmv.cu.o
src/Benchmarks/DistSpMV/CMakeFiles/tnl-benchmark-distributed-spmv-cuda.dir/build.make:543: recipe for target 'src/Benchmarks/DistSpMV/CMakeFiles/tnl-benchmark-distributed-spmv-cuda.dir/tnl-benchmark-dist$
make[2]: *** [src/Benchmarks/DistSpMV/CMakeFiles/tnl-benchmark-distributed-spmv-cuda.dir/tnl-benchmark-distributed-spmv-cuda_generated_tnl-benchmark-distributed-spmv.cu.o] Error 1
CMakeFiles/Makefile2:2676: recipe for target 'src/Benchmarks/DistSpMV/CMakeFiles/tnl-benchmark-distributed-spmv-cuda.dir/all' failed
make[1]: *** [src/Benchmarks/DistSpMV/CMakeFiles/tnl-benchmark-distributed-spmv-cuda.dir/all] Error 2
```
Originally Atomic.h:153 had the same error but with the code:
return const_cast<Atomic*>(this)->fetch_add( 0 );
Assignee: Tomáš Oberhuber

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/31
Use std::swap instead of custom methods (Jakub Klinkovský, 2019-08-31)
`std::swap` works out-of-the-box for objects that are `MoveAssignable` and `MoveConstructible`, see https://stackoverflow.com/q/39675073. If it does not work out-of-the-box, `std::swap` can be overloaded for custom objects. Hence, `std::swap` should be preferred over custom `swap` methods, which in most cases just do the trivial thing.
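For illustration, `std::swap` already handles any movable type, so a custom `swap` method is redundant in the common case (the `Buffer` type below is hypothetical):

```cpp
#include <utility>
#include <vector>

// Hypothetical movable type: std::swap works on it out of the box
// via three moves, with no custom swap method needed.
struct Buffer
{
   std::vector< double > data;
};

// Swap two buffers using the standard algorithm.
void swapBuffers( Buffer& a, Buffer& b )
{
   std::swap( a, b );  // uses Buffer's implicit move operations
}
```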
Since [C++20](https://en.cppreference.com/w/cpp/algorithm/swap) `std::swap` will be `constexpr`, so it will be possible to remove even the [TNL::swap](https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/blob/a82afc32eeac15982bbb26703de621491eb9e387/src/TNL/Math.h#L159-171) function.
Assignee: Jakub Klinkovský

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/32
CUDA reduction uses the same buffer as input and output array (Tomáš Oberhuber, 2019-04-15)
In `TNL/Containers/Algorithms/Reduction_impl.h:139`, when calling `CudaReductionKernelLauncher`, we use the same buffer `deviceAux1` as both the input and output buffer for the reduction. If the CUDA block with index 0 is not the first one to finish its work, its data can be overwritten by other CUDA blocks. We want to avoid allocating two buffers. A solution might be to increase the size of `deviceAux1` and split it into two buffers. We need to check performance when fixing this!
Assignee: Tomáš Oberhuber

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/33
Implement segmented prefix-sum in CUDA (Tomáš Oberhuber, 2019-08-14)

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/34
Fix comparison operator of ArrayView/VectorView in gtest (Tomáš Oberhuber, 2019-07-16)
Gtest does not accept (cannot compile) `operator==` for `ArrayView` or `VectorView`.
`EXPECT_EQ( u, v )` must be replaced with `EXPECT_TRUE( u == v )`. See for example `ArrayViewTest.h`, the `assignmentOperator` test.
Assignee: Tomáš Oberhuber

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/35
Overriding of RealType in vertical expressions (Tomáš Oberhuber, 2019-08-10)
The original implementation of vector operations allowed one to do this:
```
using VectorType = Containers::Vector< bool, Devices::Host >;
VectorType v( 100 );
v.setValue( true );
auto a = v.sum< int >();
```
The summation would be performed in bool by default, which would not give the correct result. We can, however, simply change it to `int`. In the new implementation, we state
```
a = sum( v );
```
instead. I would like to be able to write `a = sum< int >( v )` but I did not find any way how to do it.
A solution might be `a = sum( ( int ) v )`.

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/36
addEntryEnum does not work for list entry (Tomáš Oberhuber, 2019-07-14)
ConfigDescription crashes when one tries to define entry enum values using addEntryEnum for a list entry, for example like this:
```
configDescription.addList< String >( "string-list" );
configDescription.addEntryEnum< String >( "entry" );
```
The problem seems to be on ConfigDescription.h:139, where the entry is cast to EntryType, which is std::vector< String > and not String.
Assignee: Tomáš Oberhuber

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/37
Add method ParameterContainer::getList (Tomáš Oberhuber, 2019-07-14)
Fetching a list of parameters from the ParameterContainer currently works as follows:
```
const auto& list = parameters.getParameter< std::vector< String > >( "list" );
```
It should be replaced with
```
const auto& list = parameters.getList< String >( "list" );
```
Assignee: Tomáš Oberhuber

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/38
Reimplement StaticVectorExpressions using StaticFor (Tomáš Oberhuber, 2019-08-14)
Reimplement StaticVectorExpressions using StaticFor as it is done in StaticArray and StaticVector.
Assignee: Tomáš Oberhuber

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/39
Add variadic template parameter to Reduction for user defined arguments (Tomáš Oberhuber, 2019-08-11)
Add a variadic template parameter to Reduction for user-defined arguments. The arguments would be passed to the fetcher.
Assignee: Tomáš Oberhuber

https://mmg-gitlab.fjfi.cvut.cz/gitlab/tnl/tnl-dev/-/issues/40
Optimize scalar product (reduction) on CPU (Tomáš Oberhuber, 2019-10-12)
Benchmarks show that our implementation of the scalar product on CPU is very slow.
```
scalar product 400000 CPU 5.4616 0.00109134 N/A
scalar product 400000 CPU ET 4.96865 0.00119962 0.909742
scalar product 400000 CPU BLAS 17.7799 0.000335237 3.25543
```
Since the ET (expression templates) and non-ET versions behave almost the same, it seems that even the original implementation before switching to ET was not optimal. We should check the implementation of the scalar product in BLAS and improve ours.
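For comparison, a plain loop with an OpenMP reduction clause is usually enough for the compiler to parallelize and vectorize a dot product; this is only a sketch, not TNL's actual implementation:

```cpp
#include <cstddef>

// Sketch: scalar product with an OpenMP reduction; with the pragma ignored
// (no -fopenmp) this is still a simple, auto-vectorizable loop.
double scalarProduct( const double* u, const double* v, std::size_t size )
{
   double result = 0.0;
   #pragma omp parallel for reduction( + : result )
   for( std::size_t i = 0; i < size; i++ )
      result += u[ i ] * v[ i ];
   return result;
}
```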