Commit b81e52dc authored by Jakub Klinkovský

working on the MHFEM section - recomputed results with TNL on multiple CPU nodes

parent 9d4cf490
+2 −2
@@ -3,7 +3,7 @@ Original data:
   (only GPU results)
 - rci_2D: research_data/MHFEM/2021.03.08_mcwhdd_benchmark_rci/2D_triangles/
   (only CPU results for 2D)
-- rci_3D_multinode: research_data/MHFEM/2021.10.26_mcwhdd_benchmark_rci/3D_tetrahedrons/
-  (only CPU results for 3D, >24 ranks)
 - rci_3D: research_data/MHFEM/2022.07.28_mcwhdd_benchmark_rci_tnl_recompute/tnl-mhfem/simulation_cases/mcwhdd/
   (only CPU results for 3D, <=24 ranks)
+- rci_3D_multinode: research_data/MHFEM/2022.07.28_mcwhdd_benchmark_rci_tnl_recompute_multi-node/tnl-mhfem/simulation_cases/mcwhdd/
+  (only CPU results for 3D, >24 ranks)
+13 −19
@@ -71,49 +71,43 @@ $ \np{24} $ & $ \np{2} $ & $ \np{1} $ &
 15695.0  &  12.0  &  0.50  &  12184.2  &  15.5  &  0.65  &  2485.0  &  15.3  &  0.64 \\

 $ \np{48} $  &  $ \np{4} $  &  $ \np{2} $  &  
-  &    &    &  6171.4  &  30.6  &  0.64  &  1249.1  &  30.4  &  0.63 \\
+  &    &    &  6029.0  &  31.3  &  0.65  &  1249.1  &  30.4  &  0.63 \\

 $ \np{72} $  &  $ \np{6} $  &  $ \np{3} $  &  
-  &    &    &  4026.3  &  46.9  &  0.65  &  880.2  &  43.2  &  0.60 \\
+  &    &    &  4054.7  &  46.5  &  0.65  &  880.2  &  43.2  &  0.60 \\

 $ \np{96} $  &  $ \np{8} $  &  $ \np{4} $  &  
-  &    &    &  3016.0  &  62.6  &  0.65  &  592.3  &  64.1  &  0.67 \\
+  &    &    &  2974.5  &  63.4  &  0.66  &  592.3  &  64.1  &  0.67 \\

 $ \np{120} $  &  $ \np{10} $  &  $ \np{5} $  &  
-  &    &    &  2374.4  &  79.5  &  0.66  &  471.2  &  80.6  &  0.67 \\
+  &    &    &  2483.0  &  76.0  &  0.63  &  471.2  &  80.6  &  0.67 \\

 $ \np{144} $  &  $ \np{12} $  &  $ \np{6} $  &  
-  &    &    &  1968.2  &  95.9  &  0.67  &  415.8  &  91.4  &  0.63 \\
+  &    &    &  2000.0  &  94.4  &  0.66  &  415.8  &  91.4  &  0.63 \\

 $ \np{168} $  &  $ \np{14} $  &  $ \np{7} $  &  
-  &    &    &  1643.1  &  114.8  &  0.68  &  372.2  &  102.1  &  0.61 \\
+  &    &    &  1607.7  &  117.4  &  0.70  &  372.2  &  102.1  &  0.61 \\

 $ \np{192} $  &  $ \np{16} $  &  $ \np{8} $  &  
-  &    &    &  1410.4  &  133.8  &  0.70  &  310.7  &  122.3  &  0.64 \\
+  &    &    &  1380.4  &  136.7  &  0.71  &  310.7  &  122.3  &  0.64 \\

 $ \np{216} $  &  $ \np{18} $  &  $ \np{9} $  &  
-  &    &    &  1242.5  &  151.9  &  0.70  &  277.5  &  136.9  &  0.63 \\
+  &    &    &  1209.6  &  156.0  &  0.72  &  277.5  &  136.9  &  0.63 \\

 $ \np{240} $  &  $ \np{20} $  &  $ \np{10} $  &  
-  &    &    &  1114.3  &  169.4  &  0.71  &  240.3  &  158.1  &  0.66 \\
+  &    &    &  1082.0  &  174.4  &  0.73  &  240.3  &  158.1  &  0.66 \\

 $ \np{264} $  &  $ \np{22} $  &  $ \np{11} $  &  
-  &    &    &  1003.8  &  188.0  &  0.71  &  251.5  &  151.0  &  0.57 \\
+  &    &    &  974.9  &  193.6  &  0.73  &  251.5  &  151.0  &  0.57 \\

 $ \np{288} $  &  $ \np{24} $  &  $ \np{12} $  &  
-  &    &    &  924.2  &  204.2  &  0.71  &  223.9  &  169.7  &  0.59 \\
+  &    &    &  892.5  &  211.4  &  0.73  &  223.9  &  169.7  &  0.59 \\

 $ \np{312} $  &  $ \np{26} $  &  $ \np{13} $  &  
-  &    &    &  860.5  &  219.3  &  0.70  &  202.9  &  187.2  &  0.60 \\
+  &    &    &  901.8  &  209.3  &  0.67  &  202.9  &  187.2  &  0.60 \\

 $ \np{336} $  &  $ \np{28} $  &  $ \np{14} $  &  
-  &    &    &  807.3  &  233.8  &  0.70  &  201.9  &  188.2  &  0.56 \\
-
-$ \np{360} $  &  $ \np{30} $  &  $ \np{15} $  &  
-  &    &    &  761.6  &  247.8  &  0.69  &    &    &   \\
-
-$ \np{384} $  &  $ \np{32} $  &  $ \np{16} $  &  
-  &    &    &  702.4  &  268.7  &  0.70  &    &    &   \\
+  &    &    &  851.9  &  221.5  &  0.66  &  201.9  &  188.2  &  0.56 \\

 \bottomrule
 \end{tabular}
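
The hunk does not show the table header, so the column meanings below are an assumption: each group of three numbers is read as (wall time in seconds, speedup, parallel efficiency), and the first column of each row as the number of MPI ranks. Under that reading, a minimal Python sketch can check that the recomputed values in the middle column group are internally consistent, i.e. that efficiency equals speedup divided by ranks and that time times speedup gives a roughly constant implied single-rank baseline. The variable and function names are illustrative only.

```python
# Recomputed values copied from the middle column group of the rows above.
# Assumed layout (not stated in the hunk): (ranks, time [s], speedup, efficiency).
rows = [
    (48, 6029.0, 31.3, 0.65),
    (72, 4054.7, 46.5, 0.65),
    (96, 2974.5, 63.4, 0.66),
    (120, 2483.0, 76.0, 0.63),
    (144, 2000.0, 94.4, 0.66),
    (168, 1607.7, 117.4, 0.70),
    (192, 1380.4, 136.7, 0.71),
    (216, 1209.6, 156.0, 0.72),
    (240, 1082.0, 174.4, 0.73),
    (264, 974.9, 193.6, 0.73),
    (288, 892.5, 211.4, 0.73),
    (312, 901.8, 209.3, 0.67),
    (336, 851.9, 221.5, 0.66),
]


def efficiency(speedup, ranks):
    """Parallel efficiency as speedup per rank."""
    return speedup / ranks


# Rows whose tabulated efficiency differs from speedup/ranks by more than
# the rounding error of a two-decimal value.
mismatches = [
    row for row in rows
    if abs(efficiency(row[2], row[0]) - row[3]) > 0.005 + 1e-9
]

# Implied single-rank time for each row; it should be nearly constant if the
# speedups were all computed against the same baseline run.
implied_baselines = [time * speedup for _, time, speedup, _ in rows]

print("rows with inconsistent efficiency:", mismatches)
print("implied baseline range [s]:", min(implied_baselines), max(implied_baselines))
```

The same check can be applied to the other column groups by swapping in their values.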
+31 −30
@@ -14,11 +14,11 @@
 | Material model:                                                   BrooksCorey |
 | Formulation:                                                             PwPn |
 +-------------------------------------------------------------------------------+
-| Host name:                                                                n16 |
+| Host name:                                                                n01 |
 | System:                                                                 Linux |
-| Release:                                           3.10.0-957.10.1.el7.x86_64 |
+| Release:                                          3.10.0-1127.13.1.el7.x86_64 |
 | Architecture:                                                          x86_64 |
-| TNL compiler:                                                 GNU G++ (9.3.0) |
+| TNL compiler:                                                GNU G++ (10.2.0) |
 | CPU info                                                                      |
 |  Model name:                         Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz |
 |  Cores:                                                                    12 |
@@ -26,34 +26,35 @@
 |  Max clock rate (in MHz):                                                3001 |
 |  Cache (L1d, L1i, L2, L3):                                32, 32, 1024, 25344 |
 +-------------------------------------------------------------------------------+
-| Started at:                                         Wed Oct 27 2021, 08:54:19 |
+| Started at:                                         Tue Aug 02 2022, 09:24:37 |
 +-------------------------------------------------------------------------------+
 +-------------------------------------------------------------------------------+
-| Finished at:                                        Wed Oct 27 2021, 09:33:55 |
-| Total count of linear solver iterations:                               426426 |
-| Pre-iterate time:                                                     21.8379 |
-|   nonlinear update time:                                              6.77782 |
-|   update_b time:                                                       5.0516 |
-|   upwind update time:                                                 3.99432 |
-|   upwind MPI synchronization time:                                    1.61482 |
-|   update_R time:                                                      2.56532 |
-|   update_Q time:                                                      1.81385 |
-|   model pre-iterate time:                                          0.00108094 |
-| Linear system assembler time:                                         13.2039 |
-| Linear preconditioner update time:                                   0.962032 |
-| Linear system solver time:                                            2335.36 |
-|   MPI synchronizations count:                                          854452 |
-|   MPI synchronization time:                                           180.265 |
-|     async wait before start time:                                           0 |
-|     async start time:                                                 243.503 |
-|     async wait time:                                                        0 |
-| Post-iterate time:                                                    2.10193 |
-|   Z_iF -> Z_iK update time:                                          0.602696 |
-|   velocities update time:                                              1.4911 |
-|   model post-iterate time:                                        0.000900124 |
+| Finished at:                                        Tue Aug 02 2022, 10:06:02 |
+| Total number of linear solver iterations:                              425292 |
+| Total number of time steps:                                               800 |
+| Pre-iterate time: avg: 2.607881e+01 stddev: 2.772884e+00 min: 2.184665e+01 max: 3.179474e+01 |
+|   nonlinear update time: avg: 6.813193e+00 stddev: 1.655731e-01 min: 6.429909e+00 max: 7.154507e+00 |
+|   update_b time: avg: 6.170690e+00 stddev: 2.516327e-01 min: 5.743104e+00 max: 7.364855e+00 |
+|   upwind update time: avg: 3.694024e+00 stddev: 1.539142e-01 min: 3.445674e+00 max: 4.177881e+00 |
+|   upwind MPI synchronization time: avg: 5.268176e+00 stddev: 2.532207e+00 min: 1.744063e+00 max: 1.079291e+01 |
+|   update_R time: avg: 2.371868e+00 stddev: 1.024903e-01 min: 2.122180e+00 max: 2.626387e+00 |
+|   update_Q time: avg: 1.740290e+00 stddev: 4.547246e-02 min: 1.593745e+00 max: 1.843588e+00 |
+|   model pre-iterate time: avg: 1.086477e-03 stddev: 8.702902e-05 min: 9.150040e-04 max: 1.355301e-03 |
+| Linear system assembler time: avg: 1.442325e+01 stddev: 5.291571e-01 min: 1.329541e+01 max: 1.579588e+01 |
+| Linear preconditioner update time: avg: 8.689404e-01 stddev: 4.150531e-02 min: 6.801816e-01 max: 9.416914e-01 |
+| Linear system solver time: avg: 2.439947e+03 stddev: 2.943175e+00 min: 2.433416e+03 max: 2.444364e+03 |
+|   faceSynchronizer async operations count:                             852185 |
+|   faceSynchronizer async operations time: avg: 2.041594e+02 stddev: 4.713659e+01 min: 1.079442e+02 max: 3.308697e+02 |
+|     async wait before start time: avg: 0.000000e+00 stddev: 0.000000e+00 min: 0.000000e+00 max: 0.000000e+00 |
+|     async start time: avg: 2.041594e+02 stddev: 4.713659e+01 min: 1.079442e+02 max: 3.308697e+02 |
+|     async wait time: avg: 0.000000e+00 stddev: 0.000000e+00 min: 0.000000e+00 max: 0.000000e+00 |
+| Post-iterate time: avg: 2.059160e+00 stddev: 6.237737e-02 min: 1.912103e+00 max: 2.191095e+00 |
+|   Z_iF -> Z_iK update time: avg: 5.954669e-01 stddev: 1.796252e-02 min: 5.389611e-01 max: 6.318699e-01 |
+|   velocities update time: avg: 1.455658e+00 stddev: 4.888862e-02 min: 1.339857e+00 max: 1.556415e+00 |
+|   model post-iterate time: avg: 9.208996e-04 stddev: 8.069719e-05 min: 7.779590e-04 max: 1.156546e-03 |
 | MPI operations (included in the previous phases):                             |
-|   MPI_Allreduce time:                                                 103.594 |
-| Compute time:                                                         2374.41 |
-| I/O time:                                                             1.47915 |
-| Total time:                                                            2376.6 |
+|   MPI_Allreduce time: avg: 2.876816e+02 stddev: 6.369748e+01 min: 1.644152e+02 max: 4.870491e+02 |
+| Compute time:                                                         2482.99 |
+| I/O time:                                                             1.69094 |
+| Total time:                                                           2485.42 |
 +-------------------------------------------------------------------------------+
+31 −30
@@ -14,11 +14,11 @@
 | Material model:                                                   BrooksCorey |
 | Formulation:                                                             PwPn |
 +-------------------------------------------------------------------------------+
-| Host name:                                                                n13 |
+| Host name:                                                                n04 |
 | System:                                                                 Linux |
-| Release:                                           3.10.0-957.10.1.el7.x86_64 |
+| Release:                                          3.10.0-1127.13.1.el7.x86_64 |
 | Architecture:                                                          x86_64 |
-| TNL compiler:                                                 GNU G++ (9.3.0) |
+| TNL compiler:                                                GNU G++ (10.2.0) |
 | CPU info                                                                      |
 |  Model name:                         Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz |
 |  Cores:                                                                    12 |
@@ -26,34 +26,35 @@
 |  Max clock rate (in MHz):                                                3001 |
 |  Cache (L1d, L1i, L2, L3):                                32, 32, 1024, 25344 |
 +-------------------------------------------------------------------------------+
-| Started at:                                         Wed Oct 27 2021, 09:35:01 |
+| Started at:                                         Tue Aug 02 2022, 08:49:04 |
 +-------------------------------------------------------------------------------+
 +-------------------------------------------------------------------------------+
-| Finished at:                                        Wed Oct 27 2021, 10:07:51 |
-| Total count of linear solver iterations:                               425776 |
-| Pre-iterate time:                                                     18.5396 |
-|   nonlinear update time:                                              5.75978 |
-|   update_b time:                                                      4.17225 |
-|   upwind update time:                                                 3.22175 |
-|   upwind MPI synchronization time:                                    1.48792 |
-|   update_R time:                                                      2.22835 |
-|   update_Q time:                                                      1.55086 |
-|   model pre-iterate time:                                          0.00099187 |
-| Linear system assembler time:                                         11.2656 |
-| Linear preconditioner update time:                                   0.809853 |
-| Linear system solver time:                                            1935.78 |
-|   MPI synchronizations count:                                          853152 |
-|   MPI synchronization time:                                           156.014 |
-|     async wait before start time:                                           0 |
-|     async start time:                                                 156.014 |
-|     async wait time:                                                        0 |
-| Post-iterate time:                                                    1.80103 |
-|   Z_iF -> Z_iK update time:                                          0.510155 |
-|   velocities update time:                                             1.28328 |
-|   model post-iterate time:                                        0.000923872 |
+| Finished at:                                        Tue Aug 02 2022, 09:22:27 |
+| Total number of linear solver iterations:                              422465 |
+| Total number of time steps:                                               800 |
+| Pre-iterate time: avg: 2.112036e+01 stddev: 1.679580e+00 min: 1.816553e+01 max: 2.511782e+01 |
+|   nonlinear update time: avg: 5.726919e+00 stddev: 1.470366e-01 min: 5.276186e+00 max: 6.049857e+00 |
+|   update_b time: avg: 5.145289e+00 stddev: 1.346186e-01 min: 4.748365e+00 max: 5.531417e+00 |
+|   upwind update time: avg: 3.055560e+00 stddev: 1.116210e-01 min: 2.832433e+00 max: 3.526188e+00 |
+|   upwind MPI synchronization time: avg: 3.738538e+00 stddev: 1.593089e+00 min: 1.195956e+00 max: 7.824148e+00 |
+|   update_R time: avg: 1.972653e+00 stddev: 1.056654e-01 min: 1.685340e+00 max: 2.164914e+00 |
+|   update_Q time: avg: 1.460958e+00 stddev: 5.588222e-02 min: 1.288470e+00 max: 1.598162e+00 |
+|   model pre-iterate time: avg: 1.089174e-03 stddev: 8.436100e-05 min: 9.011600e-04 max: 1.336621e-03 |
+| Linear system assembler time: avg: 1.184110e+01 stddev: 4.545254e-01 min: 1.075709e+01 max: 1.348738e+01 |
+| Linear preconditioner update time: avg: 7.381469e-01 stddev: 3.625849e-02 min: 6.217543e-01 max: 8.306800e-01 |
+| Linear system solver time: avg: 1.964211e+03 stddev: 1.941292e+00 min: 1.960020e+03 max: 1.967813e+03 |
+|   faceSynchronizer async operations count:                             846530 |
+|   faceSynchronizer async operations time: avg: 1.670798e+02 stddev: 3.719086e+01 min: 8.631573e+01 max: 2.594999e+02 |
+|     async wait before start time: avg: 0.000000e+00 stddev: 0.000000e+00 min: 0.000000e+00 max: 0.000000e+00 |
+|     async start time: avg: 1.670798e+02 stddev: 3.719086e+01 min: 8.631573e+01 max: 2.594999e+02 |
+|     async wait time: avg: 0.000000e+00 stddev: 0.000000e+00 min: 0.000000e+00 max: 0.000000e+00 |
+| Post-iterate time: avg: 1.737339e+00 stddev: 5.246855e-02 min: 1.572052e+00 max: 1.842635e+00 |
+|   Z_iF -> Z_iK update time: avg: 4.997481e-01 stddev: 1.428420e-02 min: 4.581276e-01 max: 5.286212e-01 |
+|   velocities update time: avg: 1.229411e+00 stddev: 4.274759e-02 min: 1.104548e+00 max: 1.306016e+00 |
+|   model post-iterate time: avg: 9.222890e-04 stddev: 9.277572e-05 min: 7.468810e-04 max: 1.167797e-03 |
 | MPI operations (included in the previous phases):                             |
-|   MPI_Allreduce time:                                                 194.002 |
-| Compute time:                                                         1968.21 |
-| I/O time:                                                              1.7977 |
-| Total time:                                                           1970.72 |
+|   MPI_Allreduce time: avg: 2.376998e+02 stddev: 6.139177e+01 min: 1.194775e+02 max: 4.514746e+02 |
+| Compute time:                                                         1999.95 |
+| I/O time:                                                             1.90638 |
+| Total time:                                                           2002.68 |
 +-------------------------------------------------------------------------------+
+30 −29
@@ -14,11 +14,11 @@
 | Material model:                                                   BrooksCorey |
 | Formulation:                                                             PwPn |
 +-------------------------------------------------------------------------------+
-| Host name:                                                                n19 |
+| Host name:                                                                n15 |
 | System:                                                                 Linux |
 | Release:                                          3.10.0-1127.13.1.el7.x86_64 |
 | Architecture:                                                          x86_64 |
-| TNL compiler:                                                 GNU G++ (9.3.0) |
+| TNL compiler:                                                GNU G++ (10.2.0) |
 | CPU info                                                                      |
 |  Model name:                         Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz |
 |  Cores:                                                                    12 |
@@ -26,34 +26,35 @@
 |  Max clock rate (in MHz):                                                3001 |
 |  Cache (L1d, L1i, L2, L3):                                32, 32, 1024, 25344 |
 +-------------------------------------------------------------------------------+
-| Started at:                                         Wed Oct 27 2021, 10:10:00 |
+| Started at:                                         Tue Aug 02 2022, 12:23:04 |
 +-------------------------------------------------------------------------------+
 +-------------------------------------------------------------------------------+
-| Finished at:                                        Wed Oct 27 2021, 10:37:25 |
-| Total count of linear solver iterations:                               424456 |
-| Pre-iterate time:                                                      15.385 |
-|   nonlinear update time:                                              4.97056 |
-|   update_b time:                                                      3.61098 |
-|   upwind update time:                                                 2.64097 |
-|   upwind MPI synchronization time:                                    1.02801 |
-|   update_R time:                                                      1.81289 |
-|   update_Q time:                                                      1.30119 |
-|   model pre-iterate time:                                         0.000940792 |
-| Linear system assembler time:                                         8.77253 |
-| Linear preconditioner update time:                                   0.617861 |
-| Linear system solver time:                                            1616.86 |
-|   MPI synchronizations count:                                          850512 |
-|   MPI synchronization time:                                           166.343 |
-|     async wait before start time:                                           0 |
-|     async start time:                                                 166.343 |
-|     async wait time:                                                        0 |
-| Post-iterate time:                                                     1.4917 |
-|   Z_iF -> Z_iK update time:                                          0.419422 |
-|   velocities update time:                                              1.0656 |
-|   model post-iterate time:                                        0.000756294 |
+| Finished at:                                        Tue Aug 02 2022, 12:49:54 |
+| Total number of linear solver iterations:                              423962 |
+| Total number of time steps:                                               800 |
+| Pre-iterate time: avg: 1.640892e+01 stddev: 2.808213e-01 min: 1.574747e+01 max: 1.698229e+01 |
+|   nonlinear update time: avg: 4.921517e+00 stddev: 1.199586e-01 min: 4.535830e+00 max: 5.490957e+00 |
+|   update_b time: avg: 4.418443e+00 stddev: 1.033474e-01 min: 4.061977e+00 max: 4.623125e+00 |
+|   upwind update time: avg: 2.565352e+00 stddev: 6.183473e-02 min: 2.447350e+00 max: 2.742922e+00 |
+|   upwind MPI synchronization time: avg: 1.351795e+00 stddev: 2.864930e-01 min: 6.277810e-01 max: 2.154879e+00 |
+|   update_R time: avg: 1.809624e+00 stddev: 5.347420e-02 min: 1.636293e+00 max: 1.949937e+00 |
+|   update_Q time: avg: 1.321394e+00 stddev: 2.795142e-02 min: 1.204020e+00 max: 1.385940e+00 |
+|   model pre-iterate time: avg: 1.108743e-03 stddev: 8.234657e-05 min: 9.572620e-04 max: 1.453031e-03 |
+| Linear system assembler time: avg: 1.003331e+01 stddev: 3.643277e-01 min: 9.071919e+00 max: 1.086738e+01 |
+| Linear preconditioner update time: avg: 6.531414e-01 stddev: 3.091586e-02 min: 5.586080e-01 max: 7.105389e-01 |
+| Linear system solver time: avg: 1.579011e+03 stddev: 4.373968e-01 min: 1.577943e+03 max: 1.580280e+03 |
+|   faceSynchronizer async operations count:                             849524 |
+|   faceSynchronizer async operations time: avg: 1.313080e+02 stddev: 2.694098e+01 min: 6.572595e+01 max: 2.058882e+02 |
+|     async wait before start time: avg: 0.000000e+00 stddev: 0.000000e+00 min: 0.000000e+00 max: 0.000000e+00 |
+|     async start time: avg: 1.313080e+02 stddev: 2.694098e+01 min: 6.572595e+01 max: 2.058882e+02 |
+|     async wait time: avg: 0.000000e+00 stddev: 0.000000e+00 min: 0.000000e+00 max: 0.000000e+00 |
+| Post-iterate time: avg: 1.510155e+00 stddev: 3.323097e-02 min: 1.380130e+00 max: 1.580664e+00 |
+|   Z_iF -> Z_iK update time: avg: 4.194220e-01 stddev: 1.093519e-02 min: 3.787597e-01 max: 4.437735e-01 |
+|   velocities update time: avg: 1.083477e+00 stddev: 2.356489e-02 min: 9.863528e-01 max: 1.131972e+00 |
+|   model post-iterate time: avg: 9.133141e-04 stddev: 7.939826e-05 min: 7.015870e-04 max: 1.175846e-03 |
 | MPI operations (included in the previous phases):                             |
-|   MPI_Allreduce time:                                                 155.039 |
-| Compute time:                                                         1643.14 |
-| I/O time:                                                             2.31154 |
-| Total time:                                                           1646.58 |
+|   MPI_Allreduce time: avg: 1.352334e+02 stddev: 2.992143e+01 min: 6.127135e+01 max: 2.075741e+02 |
+| Compute time:                                                         1607.72 |
+| I/O time:                                                             1.78985 |
+| Total time:                                                           1610.09 |
 +-------------------------------------------------------------------------------+
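
The recomputed logs report most timers as `avg`/`stddev`/`min`/`max` instead of a single value (the diff does not say what the statistics are taken over). For comparing runs, such lines can be turned into numbers with a small parser. The following Python sketch is illustrative only: the regular expression is inferred from the log lines shown in this diff, not taken from any TNL tool, and `parse_stat_line` is a hypothetical helper name.

```python
import re

# Pattern inferred from lines such as
# "| Linear system solver time: avg: ... stddev: ... min: ... max: ... |".
# Assumption: the format is exactly as printed in the logs above.
STAT_RE = re.compile(
    r"^\|\s*(?P<name>.+?):\s+avg:\s*(?P<avg>\S+)\s+stddev:\s*(?P<stddev>\S+)"
    r"\s+min:\s*(?P<min>\S+)\s+max:\s*(?P<max>\S+)\s*\|$"
)


def parse_stat_line(line):
    """Return (timer name, stats dict) for a statistics line, or None otherwise."""
    match = STAT_RE.match(line.strip())
    if match is None:
        return None
    stats = {key: float(match.group(key)) for key in ("avg", "stddev", "min", "max")}
    return match.group("name"), stats


sample = (
    "| Linear system solver time: avg: 1.579011e+03 stddev: 4.373968e-01 "
    "min: 1.577943e+03 max: 1.580280e+03 |"
)
name, stats = parse_stat_line(sample)

# Relative spread of the timer, a quick indicator of load imbalance.
relative_spread = (stats["max"] - stats["min"]) / stats["avg"]
print(name, stats, relative_spread)
```

Lines that carry a single value, such as `Compute time` or `Host name`, do not match the pattern and yield `None`, so they can be filtered out or handled separately.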