ArrayOperations: using more parallel algorithms and suitable sequential fallbacks
- cudaMemcpy is slower than our ParallelFor kernel for CUDA
- use std::copy and std::equal instead of memcpy and memcmp, but only as sequential fallbacks
- use parallel algorithms for containsValue and containsOnlyValue (again with sequential fallbacks)
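
A minimal host-side sketch of the "parallel algorithm with a sequential fallback" pattern described above. This is not the code from this commit: the `sketch` namespace, the `kSequentialThreshold` constant, the `copyElements` name, and the use of OpenMP pragmas for the parallel path are all assumptions made for illustration. If the file is compiled without OpenMP, the pragmas are ignored and the loops simply run sequentially.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical namespace, not the library's own

// Hypothetical cut-off: smaller inputs take the sequential path.
constexpr std::size_t kSequentialThreshold = 1u << 15;

template <typename T>
void copyElements(std::vector<T>& dst, const std::vector<T>& src)
{
    assert(dst.size() >= src.size());
    if (src.size() < kSequentialThreshold) {
        // Sequential fallback: std::copy instead of memcpy.
        std::copy(src.begin(), src.end(), dst.begin());
        return;
    }
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(src.size());
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

template <typename T>
bool containsValue(const std::vector<T>& v, const T& value)
{
    if (v.size() < kSequentialThreshold)
        // Sequential fallback.
        return std::find(v.begin(), v.end(), value) != v.end();

    // Parallel path: logical-OR reduction over all elements.
    bool found = false;
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(v.size());
    #pragma omp parallel for reduction(||:found)
    for (std::ptrdiff_t i = 0; i < n; ++i)
        found = found || (v[i] == value);
    return found;
}

template <typename T>
bool containsOnlyValue(const std::vector<T>& v, const T& value)
{
    if (v.size() < kSequentialThreshold)
        // Sequential fallback.
        return std::all_of(v.begin(), v.end(),
                           [&value](const T& x) { return x == value; });

    // Parallel path: logical-AND reduction over all elements.
    bool all = true;
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(v.size());
    #pragma omp parallel for reduction(&&:all)
    for (std::ptrdiff_t i = 0; i < n; ++i)
        all = all && (v[i] == value);
    return all;
}

}  // namespace sketch
```

In this sketch the sequential path can stop at the first match or mismatch, while the reduction always visits every element; that difference is one reason to keep a fallback for small inputs.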
parent f8c8673d