Merge branch 'JK/multireduction' into 'develop'
Reduction and multireduction refactoring

Brief summary:
- rewritten multireduction using lambda functions
- avoided `volatile` using `__syncwarp()`
- using reduction functions with `return a + b` instead of `a += b`
- using `std::plus`, `std::multiplies`, `std::logical_and`, `std::logical_or`, etc. instead of custom lambda functions
- optimized OpenMP thread counts for reduction and multireduction
- added computation of sample standard deviation to benchmarks
- implemented parallel prefix-sum with OpenMP
- implemented distributed prefix-sum

See merge request !37
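To illustrate the switch to value-returning reduction functions and standard function objects described in the summary, here is a minimal host-side sketch. It is not the code from this merge request: `reduce_with` is a hypothetical helper invented for this example, and the sequential loop only stands in for the real CUDA/OpenMP kernels. The point it shows is that a functor of the form `return a + b` (such as `std::plus`) can be passed directly, so no custom lambda with `a += b` semantics is needed.

```cpp
#include <functional>
#include <vector>

// Hypothetical helper (not part of TNL): folds `data` with a binary
// function object, starting from the identity element of the operation.
template <typename T, typename Reduction>
T reduce_with(const std::vector<T>& data, Reduction reduction, T identity)
{
    T result = identity;
    for (const T& x : data) {
        // The functor returns the combined value (`return a + b` style)
        // instead of modifying its first argument in place (`a += b`).
        result = reduction(result, x);
    }
    return result;
}

// Convenience wrappers showing standard functors in place of custom lambdas.
inline int sum(const std::vector<int>& v)
{
    return reduce_with(v, std::plus<int>{}, 0);        // identity of + is 0
}

inline int product(const std::vector<int>& v)
{
    return reduce_with(v, std::multiplies<int>{}, 1);  // identity of * is 1
}

inline bool all_nonzero(const std::vector<int>& v)
{
    // std::logical_and<int> converts both operands to bool and returns bool;
    // the identity of logical AND is true (stored here as 1).
    return reduce_with(v, std::logical_and<int>{}, 1) != 0;
}
```

Because the functor returns a new value rather than mutating an argument, the same object can be reused unchanged wherever partial results are combined, which is presumably part of the motivation for the change.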
Changed files:
- src/Benchmarks/BLAS/CommonVectorOperations.h (0 additions, 2 deletions)
- src/Benchmarks/BLAS/CommonVectorOperations.hpp (36 additions, 63 deletions)
- src/Benchmarks/BLAS/vector-operations.h (36 additions, 34 deletions)
- src/Benchmarks/Benchmarks.h (19 additions, 40 deletions)
- src/Benchmarks/FunctionTimer.h (78 additions, 93 deletions)
- src/Benchmarks/HeatEquation/tnl-benchmark-simple-heat-equation.h (10 additions, 1 deletion)
- src/Benchmarks/Logging.h (182 additions, 183 deletions)
- src/TNL/Containers/Algorithms/ArrayOperationsCuda.hpp (6 additions, 13 deletions)
- src/TNL/Containers/Algorithms/ArrayOperationsHost.hpp (0 additions, 1 deletion)
- src/TNL/Containers/Algorithms/ArrayOperationsMIC.hpp (0 additions, 1 deletion)
- src/TNL/Containers/Algorithms/CudaMultireductionKernel.h (108 additions, 141 deletions)
- src/TNL/Containers/Algorithms/CudaPrefixSumKernel.h (211 additions, 195 deletions)
- src/TNL/Containers/Algorithms/CudaReductionKernel.h (267 additions, 289 deletions)
- src/TNL/Containers/Algorithms/DistributedPrefixSum.h (70 additions, 0 deletions)
- src/TNL/Containers/Algorithms/Multireduction.h (49 additions, 37 deletions)
- src/TNL/Containers/Algorithms/Multireduction.hpp (229 additions, 0 deletions)
- src/TNL/Containers/Algorithms/Multireduction_impl.h (0 additions, 327 deletions)
- src/TNL/Containers/Algorithms/PrefixSum.h (87 additions, 58 deletions)
- src/TNL/Containers/Algorithms/PrefixSum.hpp (190 additions, 76 deletions)
- src/TNL/Containers/Algorithms/PrefixSumType.h (0 additions, 24 deletions)