Replaced explicit algorithms for host in VectorOperations with general implementation
According to benchmarks, there is practically no performance difference between the explicit host algorithms and the general implementation. Only explicit loop unrolling helps, and that has been implemented for the general algorithm in Reduction::reduce as well.
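
For illustration, a minimal sketch of what "general algorithm with explicit unrolling" can look like on the host. This is not the actual Reduction::reduce code: the names `reduce_unrolled`, `vector_sum`, and `scalar_product`, the 4-way unroll factor, and the fetch/reduce/identity parameter layout are all assumptions made for this example.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not the library's Reduction::reduce): a generic
// host-side reduction whose main loop is manually unrolled 4 times.
// Assumes `reduce` is associative and commutative and that `identity`
// is its neutral element, because elements are combined out of order.
template <typename Index, typename Result, typename Fetch, typename Reduce>
Result reduce_unrolled(Index size, Fetch fetch, Reduce reduce, Result identity)
{
    // Four independent accumulators shorten the dependency chain,
    // which is where the benefit of unrolling comes from.
    Result r0 = identity, r1 = identity, r2 = identity, r3 = identity;
    Index i = 0;
    for (; i + 4 <= size; i += 4) {
        r0 = reduce(r0, fetch(i));
        r1 = reduce(r1, fetch(i + 1));
        r2 = reduce(r2, fetch(i + 2));
        r3 = reduce(r3, fetch(i + 3));
    }
    // Remainder loop for the last size % 4 elements.
    for (; i < size; ++i)
        r0 = reduce(r0, fetch(i));
    return reduce(reduce(r0, r1), reduce(r2, r3));
}

// Vector operations written on top of the general reduction instead of
// dedicated hand-written loops (illustrative names only).
inline double vector_sum(const std::vector<double>& v)
{
    return reduce_unrolled(v.size(),
                           [&](std::size_t i) { return v[i]; },
                           std::plus<double>{}, 0.0);
}

inline double scalar_product(const std::vector<double>& a,
                             const std::vector<double>& b)
{
    // Assumes a.size() == b.size().
    return reduce_unrolled(a.size(),
                           [&](std::size_t i) { return a[i] * b[i]; },
                           std::plus<double>{}, 0.0);
}
```

With this shape, each vector operation only supplies a fetch lambda and a reduction functor, so the unrolling lives in one place. For floating-point types the reordered summation may differ from a sequential loop in the last bits.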