This tutorial introduces flexible parallel reduction in TNL. It shows how to implement parallel reduction with user-defined operations that can run on both CPU and GPU. Parallel reduction is a programming pattern that appears in many kinds of algorithms, for example in the scalar product, vector norms, or mean value evaluation, but also in the comparison of sequences or strings.
We explain *flexible parallel reduction* using several examples. We start with the simplest one, the sum of a sequence of numbers, followed by more advanced problems such as the scalar product and vector norms.
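
Before turning to the examples, it may help to see the general shape of the pattern in plain C++. The following is only a sequential sketch under stated assumptions: `reduce_sketch`, `sum`, `scalar_product` and `l2_norm` are hypothetical names chosen for illustration and are not the TNL API. The idea is that a reduction can be described by a *fetch* function, which produces the value for index `i`, a *reduction* function, which combines two values, and an identity element for that operation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT the TNL API: a sequential reference version
// of a reduction driven by user-defined fetch and reduction operations.
template <typename Index, typename Fetch, typename Reduction, typename Result>
Result reduce_sketch(Index begin, Index end, Fetch fetch, Reduction reduction, Result identity)
{
    assert(begin <= end);
    Result result = identity;
    for (Index i = begin; i < end; ++i)
        result = reduction(result, fetch(i));  // combine the running result with the i-th value
    return result;
}

// Sum of a sequence: fetch reads v[i], reduction adds, identity is 0.
double sum(const std::vector<double>& v)
{
    auto fetch = [&v](std::size_t i) { return v[i]; };
    auto reduction = [](double a, double b) { return a + b; };
    return reduce_sketch(std::size_t(0), v.size(), fetch, reduction, 0.0);
}

// Scalar product: only the fetch changes, it now reads u[i] * v[i].
double scalar_product(const std::vector<double>& u, const std::vector<double>& v)
{
    assert(u.size() == v.size());
    auto fetch = [&u, &v](std::size_t i) { return u[i] * v[i]; };
    auto reduction = [](double a, double b) { return a + b; };
    return reduce_sketch(std::size_t(0), u.size(), fetch, reduction, 0.0);
}

// Euclidean norm: sum of squares followed by a square root.
double l2_norm(const std::vector<double>& v)
{
    return std::sqrt(scalar_product(v, v));
}
```

In this sketch the three operations differ only in the fetch function, while the loop itself stays the same. A parallel implementation would replace the loop with a parallel scheme, which is why the reduction operation should be associative and the identity element must be neutral with respect to it.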