Repository graph: commit messages
Documentation: fixed branch name for CI builds
Merge branch 'JK/ci' into 'develop'
Documentation: added git commit id to the header
CI: split build and deploy documentation, use artifacts
CI: use the dummy build job only for merge requests
Merge branch 'doc-deploy' into 'develop'
CI: automatically skip builds for commits without any changes in the source files
Setup for automatic documentation deployment
Merge branch 'JK/multireduction' into 'develop'
Implemented distributed prefix-sum
PrefixSum: separate first and second phase for OpenMP implementation and expose performFirstPhase and performSecondPhase methods
CUDA prefix-sum: separated the implementation of the first and second phase
CUDA prefix-sum: moved gridShift from the first phase to the second phase
Replaced static member variables in CudaPrefixSumKernelLauncher with static getters
Added default stream synchronizations after kernel launches in CudaPrefixSumKernel.h
Removed volatile reduction from PrefixSum and updated the normal reduction operation
Added prefix-sum to BLAS benchmarks
Implemented parallel prefix-sum with OpenMP
Benchmarks: compute sample standard deviation of the measured computation times
Removed timing parameter from benchmarks
Benchmarks: added scalar multiplication with BLAS
Optimized OpenMP thread counts for reduction and multireduction
Ugly workaround for nvcc's stupid modification of `new` expressions
Replaced custom lambda functions with instances of STL types where possible
Changed reduction operation to use functions with `return a + b` instead of `a += b`
Removed VectorOperations class which is now useless
Removed ReductionOperations.h
Removed volatile reduction completely
Found a way to avoid using volatile in CUDA reduction: __syncwarp()
Rewritten multireduction with lambda functions
Style changes in the code for reduction
Removed reduction and multireduction declarations for MIC
Execute tests in parallel
Tests: forced using a unique file name in each test
Disabled long double in tnl-diff, tnl-init, tnl-lattice-init and tnl-view
Enabled link-time optimizations
Removed unnecessary build options from CMakeLists.txt
Cleaned up communicators in tnl-init
Fixed move-constructor in Array
Merge branch 'ndarray' into 'develop'