Jakub Klinkovský authored
- cudaMemcpy is slower than our ParallelFor kernel for CUDA
- use std::copy and std::equal instead of memcpy and memcmp, but only as sequential fallbacks
- use parallel algorithms for containsValue and containsOnlyValue (again with sequential fallbacks)
986e25fc
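
For illustration only, the sequential fallbacks named in the message could look roughly like the sketch below. The function names (copySequential, compareSequential, containsValueSequential, containsOnlyValueSequential) and their signatures are hypothetical and are not taken from the commit. The sketch covers only the host-side fallback path, not the ParallelFor kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// NOTE: all names below are hypothetical, chosen only for this sketch.

// Element-wise copy. Unlike memcpy, std::copy converts between element
// types (e.g. int -> double) instead of copying raw bytes.
template <typename Dst, typename Src>
void copySequential(Dst* destination, const Src* source, std::size_t size)
{
    assert(size == 0 || (destination != nullptr && source != nullptr));
    std::copy(source, source + size, destination);
}

// Element-wise comparison using operator==. Unlike memcmp, this compares
// values rather than object representations.
template <typename T1, typename T2>
bool compareSequential(const T1* a, const T2* b, std::size_t size)
{
    assert(size == 0 || (a != nullptr && b != nullptr));
    return std::equal(a, a + size, b);
}

// True if at least one element equals `value`.
template <typename T, typename V>
bool containsValueSequential(const T* data, std::size_t size, const V& value)
{
    return std::find(data, data + size, value) != data + size;
}

// True if every element equals `value`. For an empty range this follows
// the std::all_of convention and returns true; that is a choice of this
// sketch, not necessarily the behavior of the code in the commit.
template <typename T, typename V>
bool containsOnlyValueSequential(const T* data, std::size_t size, const V& value)
{
    return std::all_of(data, data + size,
                       [&value](const T& element) { return element == value; });
}
```

Because std::copy and std::equal work on elements rather than bytes, the same fallback also handles source and destination arrays of different element types, which memcpy and memcmp cannot do correctly.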