Commits · 7c601df9ef64236946b8d5314ef1c9112e2e11e2 · TNL / tnl-dev

Aug 23, 2019
- changes in N-S right-hand sides, turned on OpenMP parallelization for tnl-view · 7c601df9
  Jan Schäfer authored 6 years ago
- separated solvers, fixed errors in N-S right-hand sides · 12123fd1
  Jan Schäfer authored 6 years ago
- Fixes after rebase · e5d13a30
  Jakub Klinkovský authored 6 years ago
- found and resolved a problem with the 3D Riemann initial condition setter · 29a11cb4
  Jan Schäfer authored 6 years ago
- Added missing __cuda_callable__ flags to reduce the amount of compiler warnings · 35861927
  Jakub Klinkovský authored 6 years ago
- added choice of differential operator · 54c34731
  Jan Schäfer authored 6 years ago
  
  fixed AUSM+ differential operator
- added boundary conditions for the boiler model · 66a72280
  Jan Schäfer authored 6 years ago
- added AUSM+ differential operator · 4215d3b0
  root authored 6 years ago
- removed unused files · 56cfd64c
  Jan Schäfer authored 6 years ago
- Added Dirichlet and Neumann BC, resolved explicit updaters for BC · 65c38d77
  Jan Schäfer authored 6 years ago
- all differential operators and boundary conditions unified in flows · 2e559458
  Jan Schäfer authored 6 years ago
- finished merge, L-F and S-W operators changed to accept classes as equation's RHS · 56108685
  Jan Schäfer authored 6 years ago
- resolved work for host/CUDA in differential operators · b3690acc
  Jan Schäfer authored 6 years ago
- removed standalone flow solvers for Euler and Navier-Stokes equations · e7846dad
  Jan Schäfer authored 6 years ago
- added solver containing Steger-Warming, Van Leer and Lax-Friedrichs in one for... · 3bed5606
  Jan Schäfer authored 6 years ago
  
  added solver containing Steger-Warming, Van Leer and Lax-Friedrichs in one for Euler and Navier-Stokes
Aug 17, 2019
- Implemented distributed prefix-sum · d13a2d18
  Jakub Klinkovský authored 5 years ago
  
  Fixes #43
- PrefixSum: separate first and second phase for OpenMP implementation and... · 174ad5fd
  Jakub Klinkovský authored 5 years ago
  
  PrefixSum: separate first and second phase for OpenMP implementation and expose performFirstPhase and performSecondPhase methods
- CUDA prefix-sum: separated the implementation of the first and second phase · 2c40015f
  Jakub Klinkovský authored 5 years ago
- CUDA prefix-sum: moved gridShift from the first phase to the second phase · ac2ee07e
  Jakub Klinkovský authored 5 years ago
- Replaced static member variables in CudaPrefixSumKernelLauncher with static getters · 1fe62640
  Jakub Klinkovský authored 5 years ago
- Added default stream synchronizations after kernel launches in CudaPrefixSumKernel.h · af6d1d6b
  Jakub Klinkovský authored 5 years ago
- Removed volatile reduction from PrefixSum and updated the normal reduction operation · 8d0d2638
  Jakub Klinkovský authored 5 years ago
  
  Same changes as for the regular Reduction operation...
- Added prefix-sum to BLAS benchmarks · 27631930
  Jakub Klinkovský authored 5 years ago
- Implemented parallel prefix-sum with OpenMP · 7cc55dee
  Jakub Klinkovský authored 5 years ago
  
  Fixes #42
- Benchmarks: compute sample standard deviation of the measured computation times · 2bea9311
  Jakub Klinkovský authored 5 years ago
- Removed timing parameter from benchmarks · e6e6cf46
  Jakub Klinkovský authored 5 years ago
  
  Benchmarks can be easily profiled even without this parameter, so it was just an unnecessary complication.
- Benchmarks: added scalar multiplication with BLAS · 232be124
  Jakub Klinkovský authored 5 years ago
- Optimized OpenMP thread counts for reduction and multireduction · 505d0b68
  Jakub Klinkovský authored 5 years ago
- Ugly workaround for nvcc's stupid modification of `new` expressions · 32c69a11
  Jakub Klinkovský authored 5 years ago
- Replaced custom lambda functions with instances of STL types where possible · 0a57393f
  Jakub Klinkovský authored 5 years ago
- Changed reduction operation to use functions with `return a + b` instead of `a += b` · e20a0930
  Jakub Klinkovský authored 5 years ago
  
  This is nicer because it more clearly separates data load, computation and data store. Furthermore, it allows using instances of std::plus, std::logical_and, std::logical_or, etc. instead of custom lambda functions.
- Removed VectorOperations class which is now useless · d0fc1bb7
  Jakub Klinkovský authored 5 years ago
  
  It contained only methods for prefixSum and segmentedPrefixSum, which were identical for Host and Cuda, so they can be easily implemented directly in Vector and VectorView.
- Removed ReductionOperations.h · 1777e488
  Jakub Klinkovský authored 5 years ago
- Removed volatile reduction completely · 13b89a71
  Jakub Klinkovský authored 5 years ago
- Found a way to avoid using volatile in CUDA reduction: __syncwarp() · cbc2fff9
  Jakub Klinkovský authored 5 years ago
  
  The performance seems to be identical to the code using volatile.
- Rewritten multireduction with lambda functions · b74a24d2
  Jakub Klinkovský authored 5 years ago
- Style changes in the code for reduction · e470040a
  Jakub Klinkovský authored 5 years ago
- Removed reduction and multireduction declarations for MIC · 1a82b047
  Jakub Klinkovský authored 5 years ago
  
  They are not implemented anyway...
- Tests: forced using a unique file name in each test · a3ba2469
  Jakub Klinkovský authored 5 years ago
  
  This is necessary to be able to run tests in parallel.
- Disabled long double in tnl-diff, tnl-init, tnl-lattice-init and tnl-view · 307b6650
  Jakub Klinkovský authored 5 years ago
  
  The build takes too long because of this and nobody uses it anyway. CUDA does not support long double in device code at all.
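
Several of the Aug 17 commits describe splitting the prefix-sum into a first and a second phase, and switching reductions to binary functions of the form `return a + b` so that instances of std::plus and similar STL types can be used. The following is only a minimal sequential sketch of that idea, not the TNL implementation: the names `firstPhase` and `secondPhase`, the `std::vector` storage and the `blockSize` parameter are assumptions made for illustration. In a parallel version each block would be handled by a different thread or CUDA block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// First phase (hypothetical name): compute an inclusive scan inside each
// block independently and return the total of every block.
template <typename T, typename Op>
std::vector<T> firstPhase(std::vector<T>& data, std::size_t blockSize, Op op)
{
    assert(blockSize > 0);
    std::vector<T> blockSums;
    for (std::size_t begin = 0; begin < data.size(); begin += blockSize) {
        const std::size_t end = std::min(begin + blockSize, data.size());
        for (std::size_t i = begin + 1; i < end; ++i)
            data[i] = op(data[i - 1], data[i]);  // "return a + b" style operation
        blockSums.push_back(data[end - 1]);
    }
    return blockSums;
}

// Second phase (hypothetical name): shift every block except the first by
// the combined total of all preceding blocks.
template <typename T, typename Op>
void secondPhase(std::vector<T>& data, std::size_t blockSize,
                 const std::vector<T>& blockSums, Op op)
{
    if (blockSums.empty())
        return;
    T offset = blockSums[0];
    for (std::size_t b = 1; b < blockSums.size(); ++b) {
        const std::size_t begin = b * blockSize;
        const std::size_t end = std::min(begin + blockSize, data.size());
        for (std::size_t i = begin; i < end; ++i)
            data[i] = op(offset, data[i]);
        offset = op(offset, blockSums[b]);
    }
}
```

Calling `firstPhase(v, 3, std::plus<int>{})` followed by `secondPhase(v, 3, sums, std::plus<int>{})` on `{1, 2, 3, 4, 5, 6, 7}` yields the running sums `{1, 3, 6, 10, 15, 21, 28}`. Passing `std::multiplies<int>{}` instead gives a running product without any change to the two functions. The operation is assumed to be associative.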