Commit 1433c746 authored by Jakub Klinkovský

Added tests of the reduction and scan algorithms with CustomScalar

This way we test both the general CUDA implementation using shared
memory and the specialization using __shfl instructions.

Both the reduction and scan kernels needed some tweaks due to shared
memory usage with non-fundamental types.
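
For illustration only, here is a minimal host-side C++ sketch of why a scratch
buffer for a non-fundamental type usually has to be declared as raw aligned
bytes and filled explicitly. It is an analogue of the shared-memory pattern,
not the code changed in this commit. The CustomScalar definition and the
blockReduce helper below are hypothetical stand-ins, and the reduction assumes
that a default-constructed value is the identity for operator+.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the CustomScalar type named in the commit:
// a thin wrapper with user-provided constructors (a non-fundamental type).
template <typename Real>
struct CustomScalar {
    Real value;
    CustomScalar() : value(0) {}
    CustomScalar(Real v) : value(v) {}
    CustomScalar operator+(const CustomScalar& other) const {
        return CustomScalar(value + other.value);
    }
};

// Hypothetical helper: tree reduction of up to BlockSize elements.
// The scratch buffer is raw aligned bytes, and elements are constructed
// in it explicitly, mirroring how a byte-typed shared-memory array would
// be reinterpreted as T in a kernel.
template <typename T, std::size_t BlockSize>
T blockReduce(const T* input, std::size_t n)
{
    static_assert(BlockSize > 0 && (BlockSize & (BlockSize - 1)) == 0,
                  "BlockSize must be a power of two");
    assert(n <= BlockSize);

    alignas(T) unsigned char storage[BlockSize * sizeof(T)];
    T* scratch = reinterpret_cast<T*>(storage);

    // Construct every slot; slots past n get the assumed identity T().
    for (std::size_t i = 0; i < BlockSize; ++i)
        ::new (static_cast<void*>(storage + i * sizeof(T))) T(i < n ? input[i] : T());

    // Halve the active range in each step, as a block-wide reduction would.
    for (std::size_t stride = BlockSize / 2; stride > 0; stride /= 2)
        for (std::size_t i = 0; i < stride; ++i)
            scratch[i] = scratch[i] + scratch[i + stride];

    T result = scratch[0];

    // Destroy the elements explicitly, since the storage is only raw bytes.
    for (std::size_t i = 0; i < BlockSize; ++i)
        scratch[i].~T();
    return result;
}
```

A call such as blockReduce<CustomScalar<int>, 8>(data, 5) sums the first five
elements of data. A fundamental type like int goes through the same template
unchanged.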
parent 2d454b15