- Aug 27, 2019
-
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Tomáš Oberhuber authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
- Aug 26, 2019
-
-
Jakub Klinkovský authored
It caused a weird compiler error when compiled with g++ -Werror.
-
- Aug 25, 2019
-
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
- Aug 24, 2019
-
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
And running CUDA binaries doesn't crash anymore, though they still do not produce any meaningful coverage report, even for host code. This is due to nvcc's temporary source files; we will have to compile with clang++ natively to fix this.
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
They lead to an internal compiler error in some cases.
-
- Aug 23, 2019
-
-
Jakub Klinkovský authored
Documentation: fixed branch name for CI builds
See merge request !40
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
CI: deploy with artifacts
See merge request !39
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
- Aug 22, 2019
-
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
Setup for automatic documentation deployment
See merge request !38
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
- Aug 17, 2019
-
-
Jakub Klinkovský authored
Reduction and multireduction refactoring
Brief summary:
  - rewritten multireduction using lambda functions
  - avoided `volatile` using `__syncwarp()`
  - using reduction functions with `return a + b` instead of `a += b`
  - using `std::plus`, `std::multiplies`, `std::logical_and`, `std::logical_or`, etc. instead of custom lambda functions
  - optimized OpenMP thread counts for reduction and multireduction
  - added computation of sample standard deviation to benchmarks
  - implemented parallel prefix-sum with OpenMP
  - implemented distributed prefix-sum
See merge request !37
-
Jakub Klinkovský authored
Fixes #43
-
Jakub Klinkovský authored
PrefixSum: separate first and second phase for OpenMP implementation and expose performFirstPhase and performSecondPhase methods
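For readers unfamiliar with the two-phase scheme, here is a minimal sketch of a blocked inclusive prefix-sum split into two phases. It is not the project's code: the function names, signatures, and the use of `std::vector<int>` are assumptions made for illustration, and the loops are written sequentially (the per-block loops are the ones an OpenMP implementation would typically parallelize).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Phase 1 (hypothetical signature): scan every block in place and return,
// for each block, the sum of all elements that precede it.
inline std::vector<int> firstPhaseSketch(std::vector<int>& data, std::size_t blockSize)
{
    assert(blockSize > 0);
    const std::size_t blocks = (data.size() + blockSize - 1) / blockSize;
    std::vector<int> offsets(blocks, 0);
    // Blocks are independent of each other, so this loop can run in parallel.
    for (std::size_t b = 0; b < blocks; ++b) {
        const std::size_t begin = b * blockSize;
        const std::size_t end = std::min(begin + blockSize, data.size());
        for (std::size_t i = begin + 1; i < end; ++i)
            data[i] += data[i - 1];
        offsets[b] = data[end - 1];  // total of this block
    }
    // Short sequential exclusive scan that turns block totals into offsets.
    int running = 0;
    for (std::size_t b = 0; b < blocks; ++b) {
        const int total = offsets[b];
        offsets[b] = running;
        running += total;
    }
    return offsets;
}

// Phase 2 (hypothetical signature): shift every block by its offset.
inline void secondPhaseSketch(std::vector<int>& data,
                              const std::vector<int>& offsets,
                              std::size_t blockSize)
{
    assert(blockSize > 0);
    for (std::size_t b = 0; b < offsets.size(); ++b) {
        const std::size_t begin = b * blockSize;
        const std::size_t end = std::min(begin + blockSize, data.size());
        for (std::size_t i = begin; i < end; ++i)
            data[i] += offsets[b];
    }
}

// Convenience wrapper that runs both phases back to back.
inline std::vector<int> inclusiveScanSketch(std::vector<int> data, std::size_t blockSize)
{
    const std::vector<int> offsets = firstPhaseSketch(data, blockSize);
    secondPhaseSketch(data, offsets, blockSize);
    return data;
}
```

One plausible reason to expose the phases separately is that a caller can modify the offsets between them, for example by adding a contribution computed elsewhere, before the second phase is applied.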
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
Same changes as for the regular Reduction operation...
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
Fixes #42
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
Benchmarks can be easily profiled even without this parameter, so it was just an unnecessary complication.
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
-
Jakub Klinkovský authored
This is nicer because it more clearly separates data load, computation and data store. Furthermore, it allows using instances of std::plus, std::logical_and, std::logical_or, etc. instead of custom lambda functions.
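The idea can be illustrated with a small host-only sketch. It is not the library's interface: `reduceSketch` and the helpers around it are hypothetical names, and the loop is sequential. The point is only that a reduction functor which returns its result (`return a + b`) has the same shape as the standard function objects, so they can be passed in directly.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical generic reduction: `fetch(i)` loads the i-th value and
// `reduce(a, b)` returns the combined value without modifying its arguments.
template <typename Result, typename Fetch, typename Reduce>
Result reduceSketch(std::size_t size, Fetch fetch, Reduce reduce, Result identity)
{
    Result result = identity;
    for (std::size_t i = 0; i < size; ++i) {
        const Result value = fetch(i);                  // data load
        const Result combined = reduce(result, value);  // computation
        result = combined;                              // data store
    }
    return result;
}

// Sum of all elements, using std::plus instead of a custom lambda.
inline int sumSketch(const std::vector<int>& v)
{
    return reduceSketch<int>(v.size(),
                             [&v](std::size_t i) { return v[i]; },
                             std::plus<>{}, 0);
}

// True if every element is positive, using std::logical_and.
inline bool allPositiveSketch(const std::vector<int>& v)
{
    return reduceSketch<bool>(v.size(),
                              [&v](std::size_t i) { return v[i] > 0; },
                              std::logical_and<>{}, true);
}

// Usage: both helpers go through the same generic routine.
inline void usageSketch()
{
    const std::vector<int> v{1, 2, 3, 4};
    assert(sumSketch(v) == 10);
    assert(allPositiveSketch(v));
}
```

With an in-place functor of the form `a += b`, the standard function objects would not fit, since they take their arguments by const reference and return a new value.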
-