Fixed configuration of reduction kernels
The problem is that the __CUDA_ARCH__ macro is defined only in device code, so it can't be used for configuring the kernel launches.
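The pitfall can be illustrated with a minimal host-compilable sketch. It is not the code from this commit: the function names and the numeric values below are hypothetical and only show the pattern. In real code the compute capability would be a runtime value, for example the `major` field of the `cudaDeviceProp` structure filled in by `cudaGetDeviceProperties`.

```cpp
// Hypothetical names and values, for illustration only.

// Broken variant: __CUDA_ARCH__ is undefined while host code is compiled,
// so the #else branch is always taken when this is used to configure a launch.
inline int blocksPerMultiprocessorFromMacro()
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 300
    return 8;
#else
    return 4;
#endif
}

// Fixed variant: the compute capability major version is passed in as a
// runtime value (assumed to come from a device query done on the host).
inline int blocksPerMultiprocessorFromDevice(int archMajor)
{
    return archMajor >= 3 ? 8 : 4;
}

// Launch configuration computed on the host from runtime device properties.
inline int desiredGridSize(int multiprocessors, int archMajor)
{
    return multiprocessors * blocksPerMultiprocessorFromDevice(archMajor);
}
```

When called from host code, the macro-based variant returns the fallback value regardless of the GPU actually installed, whereas the runtime variant follows the queried device.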
Showing 6 changed files:
- src/TNL/Containers/Algorithms/CudaMultireductionKernel.h: 11 additions, 6 deletions
- src/TNL/Containers/Algorithms/CudaReductionKernel.h: 13 additions, 7 deletions
- src/TNL/Devices/CudaDeviceInfo.cpp: 7 additions, 0 deletions
- src/TNL/Devices/CudaDeviceInfo.cu: 15 additions, 0 deletions
- src/TNL/Devices/CudaDeviceInfo.h: 2 additions, 0 deletions
- src/TNL/Matrices/MatrixOperations.h: 6 additions, 12 deletions