- Jul 31, 2021
-
-
Jakub Klinkovský authored
The first phase performs only per-block reduction, not scan. The output array elements are written only in the second phase, so overall we perform only `n` instead of `2n` write operations.
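The idea can be sketched in plain sequential C++. This is only an illustration under assumptions, not the library's implementation: the function name `two_phase_inclusive_scan`, the `block_size` parameter, and the use of `int` addition are all hypothetical. The first phase only reduces each block, the block results are turned into exclusive offsets, and the second phase writes each output element exactly once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): inclusive prefix sum of `in`,
// processed in blocks of `block_size` elements.
std::vector<int> two_phase_inclusive_scan(const std::vector<int>& in, std::size_t block_size)
{
    assert(block_size > 0);
    const std::size_t n = in.size();
    const std::size_t blocks = (n + block_size - 1) / block_size;

    // Phase 1: per-block reduction only; nothing is written to the output yet.
    std::vector<int> block_results(blocks, 0);
    for (std::size_t b = 0; b < blocks; ++b) {
        const std::size_t end = std::min(n, (b + 1) * block_size);
        for (std::size_t i = b * block_size; i < end; ++i)
            block_results[b] += in[i];
    }

    // Exclusive scan of the block results: block b starts from the sum of blocks 0..b-1.
    int running = 0;
    for (std::size_t b = 0; b < blocks; ++b) {
        const int r = block_results[b];
        block_results[b] = running;
        running += r;
    }

    // Phase 2: scan each block from its offset; every output element is written once.
    std::vector<int> out(n);
    for (std::size_t b = 0; b < blocks; ++b) {
        const std::size_t end = std::min(n, (b + 1) * block_size);
        int acc = block_results[b];
        for (std::size_t i = b * block_size; i < end; ++i) {
            acc += in[i];
            out[i] = acc;
        }
    }
    return out;
}
```

In a parallel version the two block loops would be distributed across threads; the input is read twice, but the output is written only `n` times.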
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
Also fixed the idempotent values for Max and MaxWithArg (`std::numeric_limits<T>::lowest()` instead of `std::numeric_limits<T>::min()`).
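Why the distinction matters can be shown with a small standalone sketch. The helper `max_reduce` below is hypothetical and not the library's Max functional. For floating-point types, `std::numeric_limits<T>::min()` is the smallest positive normalized value, so using it as the starting value of a maximum gives a wrong result when all inputs are negative; `lowest()` is the most negative finite value and works for every arithmetic type.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// For floating-point types, min() is a tiny positive number, not the most negative one.
static_assert(std::numeric_limits<double>::min() > 0.0, "min() is positive for double");
static_assert(std::numeric_limits<double>::lowest() < 0.0, "lowest() is negative for double");

// Hypothetical helper: maximum of `values`, starting from the given initial value.
template <typename T>
T max_reduce(const std::vector<T>& values, T initial)
{
    T result = initial;
    for (const T& v : values)
        result = std::max(result, v);
    return result;
}
```

For integer types `min()` and `lowest()` coincide, so the difference only shows up with floating-point data.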
-
Jakub Klinkovský authored
Hence, all StaticArray, Array, ArrayView and even expression templates are directly usable in reduction without the need to create a wrapping fetch functor. Also NDArray has this interface in 1D.
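The difference between the two calling styles can be illustrated with a generic sketch. The names `reduce_range` and `reduce_array` are hypothetical and do not reproduce the library's actual signatures; the sketch only assumes that an array-like type offers `size()` and `operator[]`, which is also true of `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical fetch-based form: the caller supplies a functor mapping an index to a value.
template <typename Index, typename Fetch, typename Reduction, typename Result>
Result reduce_range(Index begin, Index end, Fetch fetch, Reduction reduction, Result initial)
{
    Result result = initial;
    for (Index i = begin; i < end; ++i)
        result = reduction(result, fetch(i));
    return result;
}

// Hypothetical convenience form: anything with size() and operator[] is passed directly,
// and the fetch functor is created internally.
template <typename Array, typename Reduction, typename Result>
Result reduce_array(const Array& array, Reduction reduction, Result initial)
{
    return reduce_range(std::size_t{0}, array.size(),
                        [&array](std::size_t i) { return array[i]; },
                        reduction, initial);
}
```

With such an overload, summing a container is `reduce_array(a, std::plus<int>{}, 0)`, while the fetch-based form stays available for cases that transform elements on the fly.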
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
The tests should not rely on other parts of the library if possible.
-
Jakub Klinkovský authored
- sequential scan does not need to be split, so "perform" performs the whole simple scan algorithm, "performFirstPhase" only reduces the block (i.e. the whole vector), "performSecondPhase" performs the scan operation with the block result combined with a global offset as the initial value
- parallel OpenMP scan calls the sequential scan to process the block results
- parallel CUDA scan was changed such that the block results array is an exclusive scan after the first phase, same as in the other device specializations
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
- used ValueType instead of RealType
- closes #87
- replaced prefix-sum with scan in the comments
- renamed variables containing "sum" to "result"
- fixed artificial blockShifts in the sequential implementation
-
Jakub Klinkovský authored
The file should be named after the main function which is implemented in it. Also changed the parameter name from "reduce" to "reduction" to differentiate it from the main "reduce" function.
-
Jakub Klinkovský authored
Extension of the implementation of staticFor

See merge request !95
-
Tomáš Jakubec authored
-
- Jul 28, 2021
-
-
Jakub Klinkovský authored
This reverts commit 05859cdd.
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
- Jul 27, 2021
-
-
Tomáš Oberhuber authored
To/sorting

See merge request !99
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
Fixing header including in Nvidia bitonic sort wrapper. Fixing namespaces definition.
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-
Tomáš Oberhuber authored
-