Commit cbd05a45 authored by Jakub Klinkovský

Merge branch 'JK/multireduction' into 'develop'

Reduction and multireduction refactoring

Brief summary:

- rewritten multireduction using lambda functions
- avoided `volatile` by using `__syncwarp()`
- using reduction functions with `return a + b` instead of `a += b`
- using `std::plus`, `std::multiplies`, `std::logical_and`, `std::logical_or`, etc. instead of custom lambda functions
- optimized OpenMP thread counts for reduction and multireduction
- added computation of sample standard deviation to benchmarks
- implemented parallel prefix-sum with OpenMP
- implemented distributed prefix-sum
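
A minimal sketch of the functor-based style described above, assuming C++14 or later. The names `reduce_sketch` and `inclusive_scan_sketch` are hypothetical and are not the project's API. Both are sequential reference versions, so they show neither the OpenMP nor the CUDA code paths. The point is only that the reduction is passed as a pure binary function (`return a + b`), so standard functors such as `std::plus` can be used directly:

```cpp
#include <cassert>     // for the assert-based checks in the usage note
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper for illustration only (not the project's API).
// `reduction` is a pure binary function returning the combined value,
// i.e. it behaves like `return a + b` rather than `a += b`.
template <typename T, typename Reduction>
T reduce_sketch(const std::vector<T>& data, Reduction reduction, T identity)
{
    T result = identity;
    for (const T& x : data)
        result = reduction(result, x);
    return result;
}

// Hypothetical sequential reference for an inclusive prefix-sum (scan)
// using the same kind of functor.
template <typename T, typename Reduction>
std::vector<T> inclusive_scan_sketch(const std::vector<T>& data,
                                     Reduction reduction, T identity)
{
    std::vector<T> out(data.size());
    T running = identity;
    for (std::size_t i = 0; i < data.size(); ++i) {
        running = reduction(running, data[i]);
        out[i] = running;
    }
    return out;
}
```

For example, with `std::vector<int> v{1, 2, 3, 4}`, both `assert(reduce_sketch(v, std::plus<>{}, 0) == 10)` and `assert(reduce_sketch(v, std::multiplies<>{}, 1) == 24)` hold, and `inclusive_scan_sketch(v, std::plus<>{}, 0)` yields `{1, 3, 6, 10}`.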

See merge request !37
parents 95b2d990 d13a2d18