Optimizing L1 and L2 norms in CUDA.
Changed files:
- src/core/cuda/CMakeLists.txt (2 additions, 0 deletions)
- src/core/cuda/cuda-reduction-diff-l2-norm_impl.cu (87 additions, 0 deletions)
- src/core/cuda/cuda-reduction-l2-norm_impl.cu (80 additions, 0 deletions)
- src/core/cuda/reduction-operations.h (60 additions, 0 deletions)
- src/core/vectors/tnlVectorOperations.h (34 additions, 3 deletions)
- src/core/vectors/tnlVectorOperationsCuda_impl.h (95 additions, 3 deletions)
- src/core/vectors/tnlVectorOperationsHost_impl.h (93 additions, 43 deletions)