The measurements of \cite{trautz2017development,trautz2017role} are available as a public dataset~\cite{trautz:dataset}.
Note that only a subset of these measurements is relevant for the purpose of this chapter, namely the airflow properties (velocity, RMS, Reynolds stress) and relative humidity above the soil surface.
Given the size of the domain and the experimental setup, the number of airflow and relative humidity measurements was necessarily limited.
The locations of the airflow and relative humidity measurements varied between the three configurations based on the spacing between the synthetic plants and the resulting flow regime.
The measurement locations are highlighted in the figures in the last section of this chapter.\todo{Check if \cref{sec:results} is really the last section, maybe add cref to the text.}
The laser used to make the flow measurements was mounted on an automated traverse located outside the wind tunnel test-section.
Uncertainty in the exact location where the measurements were made can be given as $\delta x =\delta y =\SI{\pm5}{\mm}$ and $\delta z =\SI{\pm1}{\mm}$.
The sensor used to measure relative humidity was similarly mounted on an automated traverse located within the test-section; uncertainty associated with this system is given as $\delta x =\SI{\pm10}{\mm}$ and $\delta y =\delta z =\SI{\pm5}{\mm}$.
Although a slightly variable temperature distribution above the soil tank was observed, the isothermal model given by \cref{eq:ns,eq:transport} is used with constant parameters $\rho$, $\nu$, and $D_{H_2O}$.
Furthermore, the fluid density $\rho$ is assumed to be independent of the relative humidity $\phi$ and the effect of gravity is neglected due to the dimensions of the experimental facility.
Model parameters for air under standard atmospheric conditions are given in~\cref{tab:physical parameters}.
\begin{table}[tb]
\centering
\caption{Model parameters for air under standard atmospheric conditions (\SI{25}{\degreeCelsius} and pressure of \SI{1}{bar}).}
\label{tab:physical parameters}
\begin{tabular}{lr}
\toprule
Density $\rho$~\cite{engeneeringtoolbox:air-viscosity}&\SI{1.184}{\kg\per\cubic\metre}\\
Molecular diffusivity $D_{H_2O}$ of water vapor in air \cite{massman1998}&\SI{25.52e-6}{\metre\squared\per\second}\\
\bottomrule
\end{tabular}
\end{table}
For the purpose of this thesis, we are interested in simulating only the free flow region, where the soil--atmosphere and synthetic plant (bluff body)--atmosphere interfaces are treated using boundary conditions (described in \cref{sec:boundary conditions}).
The upper side of the domain $\Omega_1$ coincides with the inclined ceiling of the test-section.
The bottom boundaries of both domains $\Omega_1$ and $\Omega_2$ coincide with the soil surface in which the two synthetic plants were planted as illustrated in~\cref{fig:domain}.
The dimensions of the subdomain $\Omega_2$ are \qtyproduct{2.94 x 0.7 x 0.5}{\metre}.
%\caption{Illustration of the computational domains. The outer block with inclined ceiling represents a domain $\Omega_1$ for \cref{eq:ns}, the smaller block represents a subdomain $\Omega_2$ for \cref{eq:transport}, and the two dark rectangles at the bottom side represent the synthetic plant blocks. Air flow direction is from right to left.}
The sides, ceiling, and floor (including soil) of the test-section are impermeable.
A simple outflow condition based on a fixed pressure value and a zero velocity gradient in the normal direction is used on the left-hand side of the domain in \cref{fig:domain} (i.e., test-section exit).
On the right-hand side of the domains~$\Omega_1$ and~$\Omega_2$ in \cref{fig:domain}, the inflow boundary conditions for velocity $\vec v$ and relative humidity~$\phi$, respectively, are prescribed.
Several inflow boundary conditions considered in this work are discussed in the following subsections; the final subsection below addresses the applied boundary conditions for the ``transpiration'' from the synthetic plants.
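For reference, the outflow condition described above can be written as
\[
p = p_{\mathrm{out}},
\qquad
\frac{\partial \vec v}{\partial \vec n} = 0
\]
at the test-section exit, where $p_{\mathrm{out}}$ denotes the prescribed fixed pressure value and $\vec n$ the outward unit normal.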
\todo{Specify the condition for $\phi$ on the free-stream boundaries of $\Omega_2$ (\ic{AdvectiveOutflow} in the code). Clarify that fixed values are prescribed on the bottom side.}
\subsubsection{Inflow: mean velocity profile}
\label{sec:inflow:time-constant}
$z_{\mathrm{const}} = \SI{0.195}{\metre}$ corresponds to the height of the synthetic plants, and the remaining parameters $\phi_{\mathrm{min}}$, $\phi_{\mathrm{max}}$, $\gamma$ are values fitted to the experimental data (see~\cref{tab:inflow:fitted values} and the supplementary materials).
Furthermore, the value $\phi_{\mathrm{in}}(0)$ is used for the fixed-value boundary condition at the bottom side of the subdomain~$\Omega_2$ that coincides with the interface between the soil tank and free space.
\begin{table}[htb]
\centering
\caption{Values of fitted parameters $\phi_{\mathrm{min}}$, $\phi_{\mathrm{max}}$, $\gamma$ for the relative humidity inflow boundary condition~\eqref{eq:inflow rh}.}
\label{tab:inflow:fitted values}
\begin{tabular}{lrrr}
\toprule
& EX-1 & EX-2 & EX-3 \\
$\gamma$& 0.184443 & 0.262402 & 0.402284 \\
\bottomrule
\end{tabular}
\end{table}
\subsubsection{Synthetic plants}
On the remaining sides of the plants, the standard no-slip condition is used.
On the upstream side, the velocity would have to be prescribed in the direction opposite to the mean flow, and on the sides parallel to the mean flow it would have to be prescribed in the tangential direction.
However, water vapor is still released diffusively from the sides where the no-slip condition is used.
\caption{Relative humidity profile prescribed as the boundary condition on the synthetic plants in the configuration with \SI{45}{\cm} spacing (EX-2).}
\label{fig:block rh:WT02_2}
\end{figure}
\section{Numerical solution}
\label{sec:WT:numerical solution}
\inline{Connect to \cref{chapter:MHFEM,chapter:LBM,chapter:LBM-MHFEM}}
To solve the system of Eqs.~\eqref{eq:NumDwarf}, suitable initial and boundary conditions must be supplied.
A simple Dirichlet-type boundary condition is used to prescribe fixed values of the unknown function on parts of the domain boundary $\partial\Omega_2$ as described in \cref{sec:boundary conditions}.
On the remaining parts of $\partial\Omega_2$, which are either impermeable walls where the no-slip condition on velocity is imposed or free-stream boundaries of $\Omega_2$ inside $\Omega_1$, a Neumann-type condition is used to prescribe zero gradient of the unknown function in the normal direction.
The coupled LBM-MHFEM solver described in \cref{chapter:LBM-MHFEM} is used for the numerical solution of \cref{eq:ns,eq:transport}.
The use of a regular lattice for the flow domain discretization is currently the main limiting factor for the flexibility of the solver, because extra care must be taken when setting up a simulation to ensure proper alignment of immersed boundaries such as the synthetic plants used in this chapter.
This could be improved by using interpolated boundary conditions for LBM~\cite{kruger:2017lattice}; however, these are not yet implemented in our solver.
On the other hand, the MHFEM part of the solver can be used on complex domain geometries with unstructured mesh discretizations.
In this chapter, we use conforming unstructured cuboidal meshes for the discretization of the domain $\Omega_2$.
We use the full-way bounce-back boundary condition~\cite{kruger:2017lattice} to impose the no-slip condition on the impermeable walls.
The boundary condition for velocity on the downstream faces of synthetic plants (see \subsubsecref{sec:bc:plants}) is realized via the modified bounce-back condition~\cite{kruger:2017lattice} by specifying zero tangential and small non-zero normal velocity of the moving wall.
On the inflow, the discrete distribution functions are approximated by the discrete equilibrium functions evaluated from the known macroscopic variables \cite{latt2008,mohamad2009,haussmann2019}.
On the outflow, the extrapolation outflow boundary condition is used to approximate the discrete distribution functions.
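For reference, the inflow populations are set to the discrete equilibria in their standard form~\cite{kruger:2017lattice},
\[
f_i^{\mathrm{eq}}(\rho, \vec v)
= w_i \rho \left( 1 + \frac{\vec c_i \cdot \vec v}{c_s^2} + \frac{(\vec c_i \cdot \vec v)^2}{2 c_s^4} - \frac{\vec v \cdot \vec v}{2 c_s^2} \right),
\]
and the modified bounce-back rule on the downstream faces of the plants corresponds to the common moving-wall correction
\[
f_{\bar i}(\vec x_b, t + \Delta t)
= f_i^{\star}(\vec x_b, t) - 2 w_i \rho_{\mathrm{w}} \frac{\vec c_i \cdot \vec v_{\mathrm{w}}}{c_s^2},
\]
where $w_i$ are the lattice weights, $\vec c_i$ the discrete velocities, $c_s$ the lattice speed of sound, $f_i^{\star}$ the post-collision distribution functions, $\bar i$ the direction opposite to $i$, $\vec x_b$ a boundary node, $\rho_{\mathrm{w}}$ the fluid density at the boundary node, and $\vec v_{\mathrm{w}}$ the prescribed wall velocity (zero tangential component and a small non-zero normal component); the exact discrete form used in our solver may differ in implementation details.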
\todo{Revise the description of boundary conditions -- either both LBM and MHFEM, or none.}
Based on the results from \cref{sec:lbm-mhfem:numerical analysis}, we use linear interpolation of the velocity field and the explicit upwind scheme in MHFEM.
\todo{Add that the non-conservative formulation of the transport equation is used for the numerical solution.}
\section{Validation results}
\label{sec:results}
\subsection{Computational methodology}
All three experimental configurations were simulated up to the final time $t_{\max}=\SI{100}{\second}$ using single precision for air flow and double precision for vapor transport in three different resolutions, hereafter denoted as RES-1, RES-2, and RES-3.
A reference lattice and mesh were generated for the initial resolution RES-1 and the space step is halved with each subsequent resolution.
The simulations were computed on a system with two AMD EPYC 7763 processors and eight Nvidia Tesla A100 GPU accelerators with NVLink interconnection.
See~\cref{tab:resolutions} for the characteristics of each resolution and computational resources needed for the simulations.
\todo{Add reference to the RCI cluster}
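Note that the base time step in \cref{tab:resolutions} decreases by roughly a factor of four with each halving of the space step, which is consistent with the diffusive scaling $\Delta t \propto \Delta x^2$ commonly used in LBM simulations:
\[
\frac{\SI{1.33e-3}{\second}}{4} \approx \SI{3.33e-4}{\second},
\qquad
\frac{\SI{3.33e-4}{\second}}{4} \approx \SI{8.32e-5}{\second}.
\]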
To match the experimental methodology described in~\cref{sec:experimental setup}, time-averaging was employed to produce statistical quantities such as the mean and variance of the flow velocity and relative humidity.
Time-averaging is implemented as part of the simulation code, where the statistical quantities are updated in every time step using Welford's online algorithm~\cite{welford1962,ling1974,chan1983}.
Note that no space-averaging is applied to the simulation results.
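For illustration, a minimal sketch of Welford's update for the running mean and variance of a single monitored quantity is given below (in Python, with hypothetical naming); the implementation in the simulation code operates on whole fields and may differ in details.
\begin{verbatim}
class RunningStats:
    """Welford's online algorithm for the mean and variance of a sample stream."""

    def __init__(self):
        self.n = 0        # number of samples processed so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Incorporate a new sample x (e.g., a velocity component at one point)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Population variance of the samples seen so far."""
        return self.m2 / self.n if self.n > 0 else 0.0
\end{verbatim}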
\begin{table}[htb]
\centering
\caption{
Characteristics of each resolution used for the presented simulations.
The computational times were obtained using 8 Nvidia Tesla A100 GPUs with NVLink interconnection.
The total computational time is broken down into cumulative contributions from the LBM computation, velocity interpolation, and MHFEM computation; the remaining time includes initialization and output of the data.
}
\label{tab:resolutions}
\begin{tabular}{lrrr}
\toprule
& RES-1 & RES-2 & RES-3 \\
Base time step $\Delta t$&\SI{1.33e-3}{\second}&\SI{3.33e-4}{\second}&\SI{8.32e-5}{\second}\\
\multicolumn{1}{p{12em}}{Average no. of LBM iters per MHFEM step ($\floor{C_{\max}/ C}$)}& 1 & 2 & 4 \\
Computational time & 10 min & 65 min & 15 h 12 min \\
-- LBM computation & 1 min & 11 min & 6 h \phantom{0}8 min \\
-- velocity interpolation & 46 s & 2 min & 30 min \\
-- MHFEM computation & 4 min & 30 min & 7 h 54 min \\
\bottomrule
\end{tabular}
\end{table}
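To illustrate how the ratio $\floor{C_{\max}/ C}$ from \cref{tab:resolutions} enters the time stepping, the following sketch shows one possible form of the coupling loop; the object names and methods are hypothetical and the actual solver described in \cref{chapter:LBM-MHFEM} may organize the loop differently.
\begin{verbatim}
# Hypothetical sketch of the LBM-MHFEM coupling loop (not the actual solver code).
t = 0.0
while t < t_max:
    C = lbm.courant_number()            # current Courant number of the flow field
    n_lbm = max(1, int(C_max // C))     # LBM iterations per one MHFEM step
    for _ in range(n_lbm):
        lbm.step(dt)                    # advance the air flow by the base time step
        t += dt
    v = lbm.interpolate_velocity()      # linear interpolation: lattice -> mesh
    mhfem.step(n_lbm * dt, v)           # advance the vapor transport equation
\end{verbatim}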
\subsection{Computational performance analysis}
To demonstrate the computational efficiency of the implemented LBM-MHFEM solver, we performed a strong scaling study whose results are shown in \cref{tab:strong_scaling} for the experimental configuration EX-1 in the resolution RES-2.
Due to limitations of the computational system, the performance scaling analysis could be performed only with 8 GPUs.
Because of the coupling with MHFEM, the overall performance of the solver in GLUPS (giga-LUPS, \textit{billions of lattice updates per second}) is about 4--5$\times$ lower than that of a standalone LBM solver\todo{Add reference to \cref{tab:lbm:strong scaling}}, depending on the adaptively selected time steps.
The efficiency decreases with increasing the number of GPUs used in the computation, which is a typical behavior in strong scaling analyses caused by reduced work per GPU and increased communication-to-work ratio.
Considering that our implementation is limited by the one-dimensional domain decomposition of the lattice, the 80\% efficiency achieved on 8 GPUs is a satisfactory result.
Higher efficiency can be expected for weak scaling studies where the amount of work is kept proportional to the number of GPUs.
However, due to the adaptive time stepping strategy used in the coupled solver, it is not straightforward to analyze the weak scaling, because the performance of the solver depends on the number of time steps where MHFEM is executed, which would be different for each problem size.
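Here, the parallel efficiency is understood in the usual strong-scaling sense, i.e., the single-GPU runtime divided by $N_{\mathrm{GPUs}}$ times the runtime on $N_{\mathrm{GPUs}}$ devices; for example, for the 8-GPU run in \cref{tab:strong_scaling},
\[
E\!f\!f(N) = \frac{T_1}{N\,T_N},
\qquad
E\!f\!f(8) = \frac{392}{8 \times 62} \approx 0.79,
\]
consistent with the roughly \SI{80}{\percent} efficiency quoted above.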
\begin{table}[htb]
\centering
\caption{
Strong scaling of the coupled LBM-MHFEM solver for the scenario EX-1 in the resolution RES-2.
$N_{\mathrm{GPUs}}$ denotes the number of Nvidia Tesla A100 GPUs used in the computation, the Time column gives the computational time without initialization, the performance metric GLUPS stands for \textit{billions of lattice updates per second}, and $E\!f\!f$ denotes the parallel efficiency.
}
\label{tab:strong_scaling}
\begin{tabular}{crrr}
\toprule
$N_{\mathrm{GPUs}}$ & Time [min] & GLUPS & $E\!f\!f$ \\
\midrule
1 & 392 & 1.0 & 1.00 \\
2 & 202 & 1.9 & 0.96 \\
4 & 110 & 3.7 & 0.92 \\
8 & 62 & 6.4 & 0.80 \\
\bottomrule
\end{tabular}
\end{table}
\subsection{On qualitative and quantitative comparisons of experiments and simulations}
Possible factors contributing to experimental uncertainty include:
\item
The reliance on a sensor immersed in the flow to measure relative humidity is furthermore invasive; its presence disturbs the flow field locally and therefore the relative humidity distribution; the impact of the sensor was not quantified by \cite{trautz2017role}.
\item
The climate controls (i.e., heater, chiller, humidifier, dehumidifier) continuously fluctuated during the experiments, typically by no more than temperature \SI{\pm 1}{\degreeCelsius}, relative humidity $\pm0.03$, and velocity \SI{\pm 0.05}{\metre\per\second}.
This can lead to momentary increases or decreases in any of the aforementioned atmospheric variables; this variability was accounted for in the model.
\todo{It is not clear how...}
\item
Uncertainty in the physical placement of the bluff bodies within the test-section, see \cref{sec:experimental setup}.
The model and physical experiments may therefore differ slightly. Note that the measurements did not explore variability in the transverse direction (i.e., $y$-axis).
When the simulation results obtained with the two inflow conditions are compared with the experimental data (i.e., the vertical columns in \cref{fig:plot2D:const_vs_turb}), it is clear that the time-varying boundary condition captures the observed behavior better.
Hence, only the time-varying inflow boundary condition is considered for the results discussed hereafter.
\begin{figure}[htb]
a) EX-2: 45 cm spacing, time-constant inflow velocity profile \\
Quantitative comparison of relative humidity profiles ($\phi$) at the first position downstream of the first synthetic plant in the \SI{15}{\cm} spacing (EX-1).
The red bar above the graph highlights the position of the profile relative to the synthetic plants (dark rectangles) and other measurement locations (thin gray bars).
Quantitative comparison of relative humidity profiles ($\phi$) at the first position downstream of the first synthetic plant in the \SI{45}{\cm} spacing (EX-2).
The red bar above the graph highlights the position of the profile relative to the synthetic plants (dark rectangles) and other measurement locations (thin gray bars).
Quantitative comparison of relative humidity profiles ($\phi$) at the first position downstream of the first synthetic plant in the \SI{105}{\cm} spacing (EX-3).
The red bar above the graph highlights the position of the profile relative to the synthetic plants (dark rectangles) and other measurement locations (thin gray bars).
\label{fig:plot1D:WT02_3:rh}
\end{figure}
\FloatBarrier
\subsection{Supplementary materials}
\inline{revise this section -- make an appendix?}
The supplementary materials for this chapter include additional graphical results: 2D flow fields simulated at all resolutions (RES-1, RES-2, RES-3), which show the effect of mesh resolution qualitatively, and 1D graphs comparing the velocity and relative humidity profiles quantitatively at all measurement locations.