Switching to "ExecutionType" instead of "DeviceType" (2ae9e97e) · Commits · TNL / tnl-dev

Commit 2ae9e97e authored Sep 04, 2019 by Jakub Klinkovský

Switching to "ExecutionType" instead of "DeviceType"

This continues the split of Device into Execution and Allocator. The
execution types in TNL/Execution are Sequential, OpenMP and Cuda
(Execution::OpenMP replaces Devices::Host).
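
To illustrate the idea of selecting an algorithm by execution type, here is a minimal sketch. The tag structs only mirror the names listed above, and the `Sum` template is a hypothetical example, not part of TNL. The Cuda specialization is left out so that the sketch compiles with a plain host compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the execution tags named in the commit message.
// The real types in TNL/Execution are not reproduced here.
namespace Execution {
struct Sequential {};
struct OpenMP {};
struct Cuda {};
}  // namespace Execution

// Hypothetical algorithm selected at compile time by an execution type.
template <typename ExecutionType>
struct Sum;

template <>
struct Sum<Execution::Sequential> {
    static double compute(const std::vector<double>& v) {
        double result = 0.0;
        for (double x : v) result += x;
        return result;
    }
};

template <>
struct Sum<Execution::OpenMP> {
    static double compute(const std::vector<double>& v) {
        double result = 0.0;
        const long n = static_cast<long>(v.size());
        // Without OpenMP support the pragma is skipped and the loop runs serially.
#ifdef _OPENMP
        #pragma omp parallel for reduction(+ : result)
#endif
        for (long i = 0; i < n; ++i) result += v[static_cast<std::size_t>(i)];
        return result;
    }
};
```

A caller would pick the backend through the template argument, e.g. `Sum<Execution::OpenMP>::compute(v)`, without touching how `v` was allocated.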

TODO:

- smart pointers: replace Device with Allocator (methods getData() and
  modifyData() should be removed; instead there should be getHostData()
  and getImageData(), each in const and non-const variants)
- serialization: use a placeholder string (like "any") because data from files should be loadable with any Executor or Allocator
- revise BuildConfigTags for problem-solvers
- compatibility of Executors with Allocators
- dynamic execution policy - to specify runtime parameters for a
  specific (parallel) algorithm
   - implementation:
      - some hierarchy of class templates which have the static execution policy as a template parameter
      - specific classes for certain algorithms (like Reduction or PrefixSum),
        e.g. `DefaultExecutionParameters<CUDA>` → `ReductionExecutionParameters<CUDA>`
                                                → `PrefixSumExecutionParameters<CUDA>`
      - most algorithms should use `DefaultExecutionParameters<DeviceType>`
   - then:
      - extend tests:
         - ParallelFor: achieve full coverage with small array size
         - finishing reduction and multireduction on host/GPU
         - prefix-sum: specify suitable maxGridSize, blockSize, elementsInBlock and decrease VECTOR_TEST_SIZE
      - cuda reduction: profiling + probably change "finish" to launch only 1 block of threads
         - try zero-copy buffer on the host instead of CudaReductionBuffer
      - try using `ParallelFor` with a specific block size in LBM
      - custom kernel launch configuration for traversers
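
The dynamic execution policy item above could be sketched roughly as follows. This is only one possible shape under stated assumptions: the class names follow the TODO, but the member names, the default values (256, 65535, 8) and the `blocksNeeded` helper are hypothetical placeholders, not TNL code.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical static execution tags (the static execution policy).
struct Sequential {};
struct CUDA {};

// Runtime parameters shared by most algorithms; empty by default.
template <typename Execution>
struct DefaultExecutionParameters {};

// Assumed CUDA launch parameters; the default values are placeholders.
template <>
struct DefaultExecutionParameters<CUDA> {
    int blockSize = 256;
    int maxGridSize = 65535;
};

// Algorithm-specific parameter classes extend the defaults.
template <typename Execution>
struct ReductionExecutionParameters : DefaultExecutionParameters<Execution> {};

template <typename Execution>
struct PrefixSumExecutionParameters : DefaultExecutionParameters<Execution> {};

template <>
struct PrefixSumExecutionParameters<CUDA> : DefaultExecutionParameters<CUDA> {
    int elementsInBlock = 8;  // placeholder value
};

// Hypothetical helper: number of blocks needed to cover `size` elements.
inline int blocksNeeded(int size, const PrefixSumExecutionParameters<CUDA>& p) {
    assert(p.blockSize > 0 && p.elementsInBlock > 0);
    const int perBlock = p.blockSize * p.elementsInBlock;
    return (size + perBlock - 1) / perBlock;  // ceiling division
}
```

With parameters settable at run time, a test could shrink `blockSize` and `elementsInBlock` so that even a small array spans several blocks, which is what the prefix-sum test item asks for.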

parent a556f79e
