Commit a6d80b09 authored by Tomáš Oberhuber

Adding todo note about a 'bug' in parallel reduction in CUDA.

parent cd6c5192
@@ -123,7 +123,7 @@ CudaReductionKernel( Operation operation,
/***
* This runs in one warp so it is synchronized implicitly.
*/
*/
if( tid < 32 )
{
volatile ResultType* vsdata = sdata;
@@ -132,6 +132,8 @@ CudaReductionKernel( Operation operation,
operation.commonReductionOnDevice( vsdata[ tid ], vsdata[ tid + 32 ] );
//printf( "4: tid %d data %f \n", tid, sdata[ tid ] );
}
// TODO: If blocksize == 32, the following does not work
// We do not check if tid < 16. Fix it!!!
if( blockSize >= 32 )
{
operation.commonReductionOnDevice( vsdata[ tid ], vsdata[ tid + 16 ] );
......
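
The TODO in this hunk can be illustrated with a small host-side model. This is only a sketch, not the kernel's code: `countOutOfBoundsReads` is a hypothetical helper, and it assumes the shared array holds exactly `blockSize` elements (the allocation is not shown in this diff). It counts how many of the 32 threads in the warp would read past the end of the array in a step of the form `vsdata[ tid ] op vsdata[ tid + offset ]`, with and without a `tid < offset` guard.

```cpp
#include <cassert>

// Hypothetical host-side model of one warp-level reduction step.
// All 32 threads of the warp execute
//    vsdata[ tid ] op vsdata[ tid + offset ]
// unless `guarded` is true, in which case only threads with tid < offset do.
// Returns the number of threads that would read an index >= sharedSize.
// Assumption: the shared array has exactly `sharedSize` elements.
int countOutOfBoundsReads( int sharedSize, int offset, bool guarded )
{
   assert( sharedSize > 0 && offset > 0 );
   const int warpSize = 32;
   int outOfBounds = 0;
   for( int tid = 0; tid < warpSize; tid++ )
   {
      // With the guard, threads in the upper part of the warp skip the step.
      if( guarded && tid >= offset )
         continue;
      // Unguarded threads with tid + offset >= sharedSize read past the end.
      if( tid + offset >= sharedSize )
         outOfBounds++;
   }
   return outOfBounds;
}
```

Under these assumptions, `countOutOfBoundsReads( 32, 16, false )` reports that threads 16 to 31 read indices 32 to 47, while the guarded variant reports none. Padding the shared array to 64 elements would also avoid the out-of-bounds reads without a guard; which fix suits this kernel is left open here.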