Commits · 33d17aab9afc763f7d316d41820b034df1dc855e · TNL / tnl-dev

Nov 26, 2019

- Copied tnl-benchmark-spmv files and spmv.h from BLAS to SpMV. Deleted min/max... · 33d17aab
  Lukas Cejka authored 6 years ago
  
  Copied tnl-benchmark-spmv files and spmv.h from BLAS to SpMV. Deleted min/max size and stepFactor. Not working yet, backup purposes.
  33d17aab

Nov 08, 2019
- Fixed internal linkage of the getHardwareMetadata function in benchmarks · e8cc0880
  Jakub Klinkovský authored 5 years ago
  
  e8cc0880
- Renamed prefixSum methods to scan · afba52d9
  Jakub Klinkovský authored 5 years ago
  
  Closes #49
  afba52d9
- Removed HostType and CudaType aliases in containers, matrices and grids · d070cc39
  Jakub Klinkovský authored 5 years ago
  
  They are not suitable for more than 2 devices/execution types; their design breaks the Open-Closed Principle. Instead, a type template "Self" was created, which allows changing any template parameter.
  d070cc39
- Removed useless typedefs such as ThisType · 3a997233
  Jakub Klinkovský authored 5 years ago
  
  3a997233
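
The "Self" type template from d070cc39 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Host` and `Cuda` tags and the three-parameter `Array` are simplified stand-ins, not TNL's actual classes. The idea is that each template parameter of `Self` defaults to the current one, so a caller overrides only the parameters that should change, and adding a new device needs no new alias in the container.

```cpp
#include <type_traits>

// Hypothetical device tags standing in for the library's device types.
struct Host {};
struct Cuda {};

// Simplified container used only for this sketch.
template <typename Value, typename Device = Host, typename Index = int>
class Array
{
public:
    using ValueType = Value;
    using DeviceType = Device;
    using IndexType = Index;

    // Rebinds any subset of the template parameters. Each parameter
    // defaults to the current one, so Self<> names the type itself.
    template <typename NewValue = Value,
              typename NewDevice = Device,
              typename NewIndex = Index>
    using Self = Array<NewValue, NewDevice, NewIndex>;
};

using HostArray = Array<float, Host>;

// Same value and index types, different device.
using CudaArray = HostArray::Self<float, Cuda>;

// Same device and index types, different value type.
using HostDoubleArray = HostArray::Self<double>;

static_assert(std::is_same<CudaArray, Array<float, Cuda, int>>::value,
              "Self rebinds the device and keeps the other parameters");
```

Compared with fixed `HostType`/`CudaType` aliases, this form does not have to enumerate the known devices inside the container.
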
Oct 25, 2019

- Moved algorithms from TNL/Containers/Algorithms/ to just TNL/Algorithms/ · 399f9627
  Jakub Klinkovský authored 5 years ago
  
  The usage of algorithms such as MemoryOperations or Reduction is not bound to a particular container. On the other hand, ArrayIO, ArrayAssignment, VectorAssignment and StaticArrayAssignment are just implementation details for the containers - moved into TNL/Containers/detail/
  
  Also moved ParallelFor, StaticFor, StaticVectorFor, TemplateStaticFor into TNL/Algorithms/
  399f9627

- Benchmarks: added benchmarks for array copy and compare using memcpy and memcmp · 7a5840de
  Jakub Klinkovský authored 5 years ago
  
  7a5840de
- Moved SystemInfo class out of the Devices namespace · dacc1711
  Jakub Klinkovský authored 5 years ago
  
  It has nothing to do with devices.
  dacc1711

- Moved synchronization of smart pointers from Devices::Cuda into TNL::Pointers... · 1743358a
  Jakub Klinkovský authored 5 years ago
  
  Moved synchronization of smart pointers from Devices::Cuda into TNL::Pointers namespace as free functions
  
  synchronizeDevice() was renamed to synchronizeSmartPointersOnDevice() for clarity - there are many similarly named functions in CUDA (e.g. cudaDeviceSynchronize()).
  1743358a

- Moved (most of) static methods from TNL::Devices::Cuda as free functions into... · 2d5176fb
  Jakub Klinkovský authored 5 years ago
  
  Moved (most of) static methods from TNL::Devices::Cuda as free functions into separate namespace TNL::Cuda
  
  The class TNL::Devices::Cuda was too bloated, breaking the Single Responsibility Principle. It should be used only for template specializations and other things common to all devices.
  
  The functions in MemoryHelpers.h are deprecated, smart pointers should be used instead.
  
  The functions in LaunchHelpers.h are temporary, more refactoring is needed with respect to execution policies and custom launch parameters.
  2d5176fb

Oct 24, 2019
- Reimplemented getType() function using typeid operator and removed useless getType() methods · 5910a5e8
  Jakub Klinkovský authored 5 years ago
  
  Fixes #46
  5910a5e8
- Removed MIC support · e7880461
  Jakub Klinkovský authored 5 years ago
  
  e7880461
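
As a rough illustration of what a `typeid`-based `getType()` (5910a5e8) might look like, here is a minimal sketch. It is not the code from the commit: the helper name `typeName` is hypothetical, and the demangling step assumes a compiler that ships `<cxxabi.h>` (GCC or Clang). On other compilers it returns whatever `std::type_info::name()` provides, which is implementation-defined.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <typeinfo>

#if defined(__GNUC__)
#include <cxxabi.h>  // abi::__cxa_demangle, available with GCC and Clang
#endif

// Hypothetical helper: returns a printable name for T based on typeid.
template <typename T>
std::string typeName()
{
    const char* raw = typeid(T).name();
#if defined(__GNUC__)
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so it has to be released with free.
    char* demangled = abi::__cxa_demangle(raw, nullptr, nullptr, &status);
    if (status == 0 && demangled != nullptr) {
        std::string result(demangled);
        std::free(demangled);
        return result;
    }
#endif
    // Fallback: the implementation-defined (possibly mangled) name.
    return raw;
}
```

Because the name comes from the compiler, per-class `getType()` methods that assemble the string by hand are no longer needed, which matches the removal described in the commit title.
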
Sep 03, 2019
- Cleanup · 78d15fb0
  Jakub Klinkovský authored 5 years ago
  
  78d15fb0
Sep 02, 2019
- Renaming PrefixSum to Scan. · 92dc4a47
  Tomáš Oberhuber authored 5 years ago
  
  92dc4a47
Aug 27, 2019
- Avoiding compiler warnings for builds without CUDA · 7390a03b
  Jakub Klinkovský authored 5 years ago
  
  7390a03b
Aug 24, 2019
- Avoiding compiler warnings · 8253355f
  Jakub Klinkovský authored 5 years ago
  
  8253355f
Aug 17, 2019
- Added prefix-sum to BLAS benchmarks · 27631930
  Jakub Klinkovský authored 5 years ago
  
  27631930
- Benchmarks: compute sample standard deviation of the measured computation times · 2bea9311
  Jakub Klinkovský authored 5 years ago
  
  2bea9311
- Removed timing parameter from benchmarks · e6e6cf46
  Jakub Klinkovský authored 5 years ago
  
  Benchmarks can be easily profiled even without this parameter, so it was just an unnecessary complication.
  e6e6cf46
- Benchmarks: added scalar multiplication with BLAS · 232be124
  Jakub Klinkovský authored 5 years ago
  
  232be124
- Ugly workaround for nvcc's stupid modification of `new` expressions · 32c69a11
  Jakub Klinkovský authored 5 years ago
  
  32c69a11
- Replaced custom lambda functions with instances of STL types where possible · 0a57393f
  Jakub Klinkovský authored 5 years ago
  
  0a57393f
- Changed reduction operation to use functions with `return a + b` instead of `a += b` · e20a0930
  Jakub Klinkovský authored 5 years ago
  
  This is nicer because it more clearly separates data load, computation and data store. Furthermore, it allows using instances of std::plus, std::logical_and, std::logical_or, etc. instead of custom lambda functions.
  e20a0930
- Removed VectorOperations class which is now useless · d0fc1bb7
  Jakub Klinkovský authored 5 years ago
  
  It contained only methods for prefixSum and segmentedPrefixSum, which were identical for Host and Cuda, so they can be easily implemented directly in Vector and VectorView.
  d0fc1bb7
- Removed volatile reduction completely · 13b89a71
  Jakub Klinkovský authored 5 years ago
  
  13b89a71
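
The change described in e20a0930 can be sketched as follows. This is a sequential stand-in written for illustration only: the function name `reduce` and its signature are assumptions, and TNL's actual reduction is parallel and has a different interface. The point is that a reduction operation of the form `return a + b` is a pure binary function, so the standard function objects can be passed directly.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sequential reduction. The operation takes two values and
// returns the combined value instead of modifying its first argument.
template <typename T, typename Reduction>
T reduce(const std::vector<T>& data, Reduction reduction, T identity)
{
    T result = identity;
    for (std::size_t i = 0; i < data.size(); ++i) {
        const T value = data[i];            // data load
        result = reduction(result, value);  // computation, then data store
    }
    return result;
}

// For contrast, the older in-place style needs a custom functor, because
// the standard function objects do not modify their arguments.
struct AddInPlace
{
    void operator()(int& a, const int& b) const { a += b; }
};
```

With this form, `reduce(v, std::plus<int>{}, 0)` sums a vector of `int`, and `std::multiplies`, `std::logical_and` or `std::logical_or` can be substituted without writing a lambda, provided the matching identity element is supplied.
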
Aug 14, 2019
- Removed conditional per-device Permutation and SliceInfo setting from NDArray and SlicedNDArray · 55ded6ad
  Jakub Klinkovský authored 5 years ago
  
  55ded6ad
- NDArray: added forBoundary method · 6c8c608e
  Jakub Klinkovský authored 6 years ago
  
  6c8c608e
- Added NDArray · e444116e
  Jakub Klinkovský authored 6 years ago
  
  e444116e
Aug 09, 2019
- Removed method sum from all vector types · d6259323
  Jakub Klinkovský authored 5 years ago
  
  d6259323
Aug 07, 2019
- BLAS benchmark: moved addVector and addVectors to VectorOperations in the Benchmarks namespace · 5640cc0f
  Jakub Klinkovský authored 5 years ago
  
  5640cc0f
- Replaced addVector(s) and scalarProduct with expression templates · f22208f3
  Jakub Klinkovský authored 5 years ago
  
  f22208f3
Aug 05, 2019
- Renamed getLocalArrayView and getLocalVectorView to simply getLocalView · 16548b66
  Jakub Klinkovský authored 5 years ago
  
  16548b66
Jul 31, 2019
- One more fix of BLAS detection. · fa802d5b
  Tomáš Oberhuber authored 5 years ago
  
  fa802d5b
Jul 30, 2019
- Fixed BLAS detection using cmake module. · 146cb9aa
  Tomáš Oberhuber authored 5 years ago
  
  146cb9aa
Jul 27, 2019
- Renaming StaticFor to TemplateStaticFor. · 1ba3ac51
  Tomáš Oberhuber authored 5 years ago
  
  1ba3ac51
Jul 26, 2019
- Added template parameter Allocator to Vector · 17c9ad67
  Jakub Klinkovský authored 5 years ago
  
  17c9ad67
- Added triad benchmark (copy to device, compute, copy to host) using different... · 2a98843d
  Jakub Klinkovský authored 5 years ago
  
  Added triad benchmark (copy to device, compute, copy to host) using different memory management strategies
  2a98843d
- Added benchmarks for array operations using different host allocators · 62ca0c97
  Jakub Klinkovský authored 5 years ago
  
  62ca0c97
Jul 25, 2019
- Fixed compiler flags for the BLAS benchmark · 1849adf2
  Jakub Klinkovský authored 5 years ago
  
  1849adf2
Jul 15, 2019
- Fixed namespaces of vectors and expressions operators. · 35a040ab
  Tomáš Oberhuber authored 5 years ago
  
  35a040ab