Commit 8accbc52 authored by Jakub Klinkovský

Optimized parallel CUDA scan algorithm to avoid unnecessary writing in the first phase

The original approach (prescan + uniform shift) is more efficient for
inputs that are expensive to evaluate, such as vector expressions.
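
The trade-off described above can be sketched on the CPU. This is a minimal illustration, not the code from this commit: it assumes the optimized variant only reduces each block in the first phase and then re-reads the input in the second phase, which the commit message does not state. All names (`Input`, `block_offsets`, `scan_prescan_shift`, `scan_reduce_then_scan`) are hypothetical, and the sequential loops over blocks stand in for CUDA thread blocks.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an input that may be a lazily evaluated expression.
using Input = std::function<int(std::size_t)>;

// Exclusive scan of the per-block sums: the starting offset of each block.
std::vector<int> block_offsets(const std::vector<int>& sums) {
    std::vector<int> offsets(sums.size(), 0);
    for (std::size_t b = 1; b < sums.size(); ++b)
        offsets[b] = offsets[b - 1] + sums[b - 1];
    return offsets;
}

// Prescan + uniform shift: phase 1 writes block-local scans to the output,
// phase 2 shifts every block by its offset. Each input element is evaluated once,
// but every output element is written twice.
std::vector<int> scan_prescan_shift(const Input& in, std::size_t n, std::size_t block) {
    std::vector<int> out(n);
    const std::size_t blocks = (n + block - 1) / block;
    std::vector<int> sums(blocks, 0);
    for (std::size_t b = 0; b < blocks; ++b) {
        int acc = 0;
        for (std::size_t i = b * block; i < std::min((b + 1) * block, n); ++i) {
            acc += in(i);
            out[i] = acc;  // first write
        }
        sums[b] = acc;
    }
    const std::vector<int> offsets = block_offsets(sums);
    for (std::size_t b = 0; b < blocks; ++b)
        for (std::size_t i = b * block; i < std::min((b + 1) * block, n); ++i)
            out[i] += offsets[b];  // second write (uniform shift)
    return out;
}

// Assumed optimized variant: phase 1 only reduces each block (no output writes),
// phase 2 re-reads the input and writes each final value once. Each input element
// is evaluated twice.
std::vector<int> scan_reduce_then_scan(const Input& in, std::size_t n, std::size_t block) {
    std::vector<int> out(n);
    const std::size_t blocks = (n + block - 1) / block;
    std::vector<int> sums(blocks, 0);
    for (std::size_t b = 0; b < blocks; ++b)
        for (std::size_t i = b * block; i < std::min((b + 1) * block, n); ++i)
            sums[b] += in(i);  // first evaluation, nothing written to out
    const std::vector<int> offsets = block_offsets(sums);
    for (std::size_t b = 0; b < blocks; ++b) {
        int acc = offsets[b];
        for (std::size_t i = b * block; i < std::min((b + 1) * block, n); ++i) {
            acc += in(i);  // second evaluation
            out[i] = acc;  // single write
        }
    }
    return out;
}
```

Both functions return the same inclusive scan. Under these assumptions, the second one saves a pass of output writes at the cost of evaluating the input twice, which is consistent with the note that the original approach is more efficient when evaluating the input is expensive.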
parent 2f61104b