- Nov 26, 2019
  - Lukas Cejka authored
  - Lukas Cejka authored: Fixed the implementation of error reporting. All formats are now compared against cuSPARSE (each format and cuSPARSE are compared to the CPU result, not to each other, which would require changes to Benchmarks.h). Added the name of the .mtx file being tested to MetaDataColumns. Code reformatting.
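    A minimal sketch of the comparison scheme this describes (the function and variable names are illustrative, not taken from the benchmark sources): every GPU result, whether from a TNL format or from cuSPARSE, is checked against the same CPU reference, so the formats never need to be compared pairwise.

    ```cpp
    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Largest absolute difference between a device result and the CPU
    // reference; each format is validated against the reference alone.
    double maxDifference( const std::vector< double >& result,
                          const std::vector< double >& reference )
    {
       double maxDiff = 0.0;
       for( std::size_t i = 0; i < result.size(); i++ )
          maxDiff = std::max( maxDiff, std::abs( result[ i ] - reference[ i ] ) );
       return maxDiff;
    }
    ```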
  - Lukas Cejka authored: Found a potential mistake in SpMV/spmv.h, where MatrixReader does not need to be called twice. Committing to show in the meeting.
  - Lukas Cejka authored: Implemented a rough version of result comparison. Implemented a benchmark comparing TNL CSR and cuSPARSE on the GPU. Edited the log file formatting.
  - Lukas Cejka authored
  - Lukas Cejka authored: Made the benchmark write the output of MatrixReader into the log file. BUG: every other error message added into the Benchmark lacks the '!' prefix in the log file.
  - Lukas Cejka authored
  - Lukas Cejka authored
  - Lukáš Matthew Čejka authored
  - Lukas Cejka authored
  - Lukas Cejka authored
  - Lukas Cejka authored
  - Lukas Cejka authored
  - Lukas Cejka authored: Copied the tnl-benchmark-spmv files and spmv.h from BLAS to SpMV. Deleted min/max size and stepFactor. Not working yet; committed for backup purposes.
- Nov 08, 2019
  - Jakub Klinkovský authored
  - Jakub Klinkovský authored: Closes #49
  - Jakub Klinkovský authored: They are not suitable for more than two devices/execution types, and their design breaks the Open-Closed Principle. Instead, a type template "Self" was created, which allows changing any template parameter.
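    A minimal sketch of the "Self" pattern, assuming a hypothetical container with a simplified parameter list (the real TNL templates have more parameters): a member alias template with defaulted arguments lets callers rebind any subset of the template parameters, e.g. to obtain the host analogue of a CUDA type without maintaining a fixed HostType/CudaType alias pair per device.

    ```cpp
    #include <cstddef>
    #include <type_traits>

    // Illustrative device tags, standing in for TNL's device types.
    struct Host {};
    struct Cuda {};

    // Illustrative container, not the actual TNL declaration.
    template< typename Value, typename Device, typename Index = std::size_t >
    struct Array
    {
       // Rebinding alias: the same class template with any subset of the
       // template parameters replaced; the defaults keep the current ones.
       template< typename Value_ = Value,
                 typename Device_ = Device,
                 typename Index_ = Index >
       using Self = Array< Value_, Device_, Index_ >;
    };

    // Rebind only the device: supporting a new device type requires no
    // change to Array itself, which is the Open-Closed point above.
    using CudaArray = Array< double, Cuda >;
    using HostArray = CudaArray::Self< double, Host >;
    static_assert( std::is_same< HostArray, Array< double, Host > >::value, "" );
    ```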
  - Jakub Klinkovský authored
- Oct 25, 2019
  - Jakub Klinkovský authored: The usage of algorithms such as MemoryOperations or Reduction is not bound to a particular container. On the other hand, ArrayIO, ArrayAssignment, VectorAssignment and StaticArrayAssignment are just implementation details of the containers, so they were moved into TNL/Containers/detail/. Also moved ParallelFor, StaticFor, StaticVectorFor and TemplateStaticFor into TNL/Algorithms/.
  - Jakub Klinkovský authored
  - Jakub Klinkovský authored: It has nothing to do with devices.
  - Jakub Klinkovský authored: Moved the synchronization of smart pointers from Devices::Cuda into the TNL::Pointers namespace as free functions. synchronizeDevice() was renamed to synchronizeSmartPointersOnDevice() for clarity, since there are many similarly named functions in CUDA (e.g. cudaDeviceSynchronize()).
  - Jakub Klinkovský authored: Moved (most of) the static methods from TNL::Devices::Cuda into the separate namespace TNL::Cuda as free functions. The class TNL::Devices::Cuda was too bloated, breaking the Single Responsibility Principle; it should be used only for template specializations and other things common to all devices. The functions in MemoryHelpers.h are deprecated; smart pointers should be used instead. The functions in LaunchHelpers.h are temporary; more refactoring is needed with respect to execution policies and custom launch parameters.
- Oct 24, 2019
  - Jakub Klinkovský authored: Fixes #46
  - Jakub Klinkovský authored
- Sep 03, 2019
  - Jakub Klinkovský authored
- Sep 02, 2019
  - Tomáš Oberhuber authored
- Aug 27, 2019
  - Jakub Klinkovský authored
- Aug 24, 2019
  - Jakub Klinkovský authored
- Aug 17, 2019
  - Jakub Klinkovský authored
  - Jakub Klinkovský authored
  - Jakub Klinkovský authored: Benchmarks can be easily profiled even without this parameter, so it was just an unnecessary complication.
  - Jakub Klinkovský authored
  - Jakub Klinkovský authored
  - Jakub Klinkovský authored
  - Jakub Klinkovský authored: This is nicer because it more clearly separates the data load, the computation, and the data store. Furthermore, it allows using instances of std::plus, std::logical_and, std::logical_or, etc. instead of custom lambda functions.
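    The split can be illustrated with a plain sequential sketch (this is not the actual TNL signature, just the fetch/reduce separation described above): the fetch functor loads and transforms one element, the reduce functor only combines partial results, so ordinary functors like std::plus<> drop in directly.

    ```cpp
    #include <cstddef>
    #include <functional>
    #include <iostream>
    #include <vector>

    // Generic reduction skeleton: 'fetch' handles the data load,
    // 'reduce' the computation, and the caller the final store.
    template< typename Index, typename Fetch, typename Reduce, typename Result >
    Result reduceSequential( Index begin, Index end,
                             Fetch&& fetch, Reduce&& reduce, Result identity )
    {
       Result result = identity;
       for( Index i = begin; i < end; i++ )
          result = reduce( result, fetch( i ) );
       return result;
    }

    int main()
    {
       std::vector< double > v{ 1, 2, 3, 4 };
       // Sum of squares: the lambda only fetches/transforms the data and
       // std::plus<> does the combining, so no custom reduction lambda.
       const double sum = reduceSequential( std::size_t( 0 ), v.size(),
                                            [ & ]( std::size_t i ) { return v[ i ] * v[ i ]; },
                                            std::plus<>{}, 0.0 );
       std::cout << sum << std::endl;   // prints 30
    }
    ```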
  - Jakub Klinkovský authored: It contained only methods for prefixSum and segmentedPrefixSum, which were identical for Host and Cuda, so they could easily be implemented directly in Vector and VectorView.
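    Usage therefore stays on the vector itself; a hedged example (the method name comes from the commit message, while the constructor and setter calls are assumptions about the contemporary TNL API):

    ```cpp
    #include <TNL/Containers/Vector.h>

    using namespace TNL;

    int main()
    {
       Containers::Vector< double, Devices::Host > v( 5 );
       v.setValue( 1.0 );   // assumed setter: fill with ones
       v.prefixSum();       // in-place prefix sum: 1 2 3 4 5
    }
    ```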
  - Jakub Klinkovský authored
- Aug 14, 2019
  - Jakub Klinkovský authored
  - Jakub Klinkovský authored
  - Jakub Klinkovský authored