Commits · 2ebb1334bea73556e99729b600e8bc2b376c32b8 · TNL / tnl-dev

Nov 26, 2019
- Implemented rough version of SpMV Benchmark for mtx files. · 2ebb1334
  Lukas Cejka authored 6 years ago
- Improved implementation of getNumberOfNonzeroMatrixElements(). · aa627012
  Lukas Cejka authored 6 years ago
- Implemented getNumberofNonzeroMatrixElements(). · efef3872
  Lukas Cejka authored 6 years ago
- Partial implementation of SpMV benchmark for mtx files. Committing for backup purposes. · 3fb1c95e
  Lukas Cejka authored 6 years ago
- Added useful functions to begin implementation. Committing for backup purposes. · 0a0c44ca
  Lukas Cejka authored 6 years ago
- Copied tnl-benchmark-spmv files and spmv.h from BLAS to SpMV. Deleted min/max size and stepFactor. Not working yet, backup purposes. · 33d17aab
  Lukas Cejka authored 6 years ago
- Deleted useless old troubleshooting cout statements. · 45509e70
  Lukas Cejka authored 6 years ago
- Deleted out-of-date TODO that wasn't for developmental purposes. · 2e2ec2ce
  Lukas Cejka authored 6 years ago
- Deleted useless comments on a solved issue. · 75febe1a
  Lukas Cejka authored 6 years ago
Nov 10, 2019
- Fixed detection of changes in .gitlab-ci.yml · 3271efbb
  Jakub Klinkovský authored 5 years ago
- Removed 'using namespace std;' from documentation examples · fae11032
  Jakub Klinkovský authored 5 years ago
- Documentation: load MathJax via https · 827a5eab
  Jakub Klinkovský authored 5 years ago
- Documentation: enable MathJax in Doxyfile · cb2c4343
  Jakub Klinkovský authored 5 years ago
Nov 08, 2019
- Merge branch 'JK/execution' into 'develop' · 6f736e10
  Jakub Klinkovský authored 5 years ago
  
  Refactoring for execution policies.
  Closes #49, #46, and #11.
  See merge request !42
- Moved skipping of synchronization directly into the synchronizeSmartPointersOnDevice function · 9723c16b
  Jakub Klinkovský authored 5 years ago
- Fixed handling of Cuda::getTransferBufferSize() in memory operations · 9615d107
  Jakub Klinkovský authored 5 years ago
- Fixed internal linkage of the getHardwareMetadata function in benchmarks · e8cc0880
  Jakub Klinkovský authored 5 years ago
- Added missing __cuda_callable__ to StaticArray and StaticVector methods · ef4cd475
  Jakub Klinkovský authored 5 years ago
- Reimplemented mesh traverser using ParallelFor · e202036e
  Jakub Klinkovský authored 5 years ago
- Added MeshTraverserTest · 11ba9c9f
  Jakub Klinkovský authored 5 years ago
- Swapped template parameters for methods in Meshes::Traverser so that UserData can be deduced · 1c31eac9
  Jakub Klinkovský authored 5 years ago
- Updated documentation in README.md · 87bf3605
  Jakub Klinkovský authored 5 years ago
- Renamed prefixSum methods to scan · afba52d9
  Jakub Klinkovský authored 5 years ago
  
  Closes #49
- Removed HostType and CudaType aliases in containers, matrices and grids · d070cc39
  Jakub Klinkovský authored 5 years ago
  
  They are not suitable for more than 2 devices/execution types; their design breaks the Open-Closed Principle. Instead, a type template "Self" was created, which makes it possible to change any template parameter.
- Removed useless typedefs such as ThisType · 3a997233
  Jakub Klinkovský authored 5 years ago
- Removed Containers::List because it has no benefits over std::list · 1b7361a9
  Jakub Klinkovský authored 5 years ago
- Fixed handling of --build parameter in the install script · 3ddc54a6
  Jakub Klinkovský authored 5 years ago
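
The "Self" type template mentioned in d070cc39 can be pictured with the minimal sketch below. It is not TNL's actual code: the Array class, the Host and Cuda tags, and all parameter names are hypothetical stand-ins, and only the idea of an alias template that rebinds any template parameter comes from the commit message.

```cpp
#include <type_traits>

// Hypothetical device tags standing in for the real device types.
struct Host {};
struct Cuda {};

// Hypothetical container used only to illustrate the pattern.
template <typename Value, typename Device = Host, typename Index = int>
class Array
{
public:
    using ValueType = Value;
    using DeviceType = Device;
    using IndexType = Index;

    // Rebinds any subset of the template parameters. Parameters that are
    // omitted keep the values of the current instantiation.
    template <typename NewValue,
              typename NewDevice = Device,
              typename NewIndex = Index>
    using Self = Array<NewValue, NewDevice, NewIndex>;
};

// Same value and index types, different device. No per-device alias such as
// HostType or CudaType is needed, so adding a third device tag would not
// require any change to Array itself.
using HostArray = Array<double, Host, long>;
using CudaArray = HostArray::Self<HostArray::ValueType, Cuda>;

static_assert(std::is_same<CudaArray, Array<double, Cuda, long>>::value,
              "Self rebinds the device and keeps the index type");
```

With fixed aliases, every new device would need one more alias in every container, which is the Open-Closed problem the commit message describes. A single rebinding alias avoids that.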
Oct 25, 2019
- Enforce builds without (more or less) any warnings · 058aa8a9
  Jakub Klinkovský authored 5 years ago
- Added Devices::Sequential and corresponding specializations in TNL::Algorithms · 7756e2d0
  Jakub Klinkovský authored 5 years ago
- Serialization in TNL::File: File::save and File::load are specialized by Allocator instead of Device · dbfa5d11
  Jakub Klinkovský authored 5 years ago
- Moved algorithms from TNL/Containers/Algorithms/ to just TNL/Algorithms/ · 399f9627
  Jakub Klinkovský authored 5 years ago
  
  The usage of algorithms such as MemoryOperations or Reduction is not bound to a particular container. On the other hand, ArrayIO, ArrayAssignment, VectorAssignment and StaticArrayAssignment are just implementation details for the containers, so they were moved into TNL/Containers/detail/.
  Also moved ParallelFor, StaticFor, StaticVectorFor, TemplateStaticFor into TNL/Algorithms/.
- Split ArrayOperations into MemoryOperations and MultiDeviceMemoryOperations · 57db358c
  Jakub Klinkovský authored 5 years ago
  
  This will be necessary to avoid code bloat with more than 2 devices (execution types).
- ArrayOperations: using more parallel algorithms and suitable sequential fallbacks · 986e25fc
  Jakub Klinkovský authored 5 years ago
  
  - cudaMemcpy is slower than our ParallelFor kernel for CUDA
  - use std::copy and std::equal instead of memcpy and memcmp, but only as sequential fallbacks
  - use parallel algorithms for containsValue and containsOnlyValue (again with sequential fallbacks)
- ArrayOperations: added missing methods for the static/sequential specialization · f8c8673d
  Jakub Klinkovský authored 5 years ago
- Benchmarks: added benchmarks for array copy and compare using memcpy and memcmp · 7a5840de
  Jakub Klinkovský authored 5 years ago
- Moved SystemInfo class out of the Devices namespace · dacc1711
  Jakub Klinkovský authored 5 years ago
  
  It has nothing to do with devices.
- Cleaned up Devices::Cuda · e2ac7194
  Jakub Klinkovský authored 5 years ago
- Removed duplicate TransferBufferSize constants · a1a054bf
  Jakub Klinkovský authored 5 years ago
  
  Also set the buffer size to 1 MiB, because a larger buffer size slows down memory copies significantly (e.g. MeshTest would take about 10x longer).
  Addresses #26
- Moved atomicAdd function from Devices/Cuda.h into Atomic.h · 15b5e2c4
  Jakub Klinkovský authored 5 years ago
- Moved synchronization of smart pointers from Devices::Cuda into TNL::Pointers namespace as free functions · 1743358a
  Jakub Klinkovský authored 5 years ago
  
  synchronizeDevice() was renamed to synchronizeSmartPointersOnDevice() for clarity, since CUDA has many similarly named functions (e.g. cudaDeviceSynchronize()).
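
Commit 986e25fc mentions using std::copy and std::equal instead of memcpy and memcmp as sequential fallbacks. The sketch below shows what such fallbacks might look like. The function names copySequential and compareSequential and their signatures are hypothetical and are not taken from TNL. The commit message does not state the motivation, so the comments only note general properties of the standard algorithms.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sequential fallback for copying between plain arrays.
template <typename Destination, typename Source, typename Index>
void copySequential(Destination* destination, const Source* source, Index size)
{
    // std::copy assigns element by element, so Source and Destination may be
    // different types; memcpy would only copy raw bytes.
    std::copy(source, source + size, destination);
}

// Hypothetical sequential fallback for comparing two plain arrays.
template <typename Element1, typename Element2, typename Index>
bool compareSequential(const Element1* a, const Element2* b, Index size)
{
    // std::equal compares with operator==, whereas memcmp compares the
    // object representation byte by byte.
    return std::equal(a, a + size, b);
}
```

For example, copySequential(dst, src, 3) with an int source and a double destination converts each element, which a byte-wise memcpy could not do.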