We will first target a "purely" data-driven approach, in line with classic machine learning. We'll refer
to this as a supervised approach in the following, to indicate that the network is fully supervised
by data, and to distinguish it from using physics-based losses.
One of the central advantages of the supervised approach is that
we obtain a surrogate model (also called an "emulator" or "neural operator"),
i.e., a new function that mimics the behavior of the original function.
The purely data-driven, supervised training is the central starting point for all projects in the context of deep learning. While it can yield suboptimal results compared to approaches that couple more tightly with physics, it can be the only choice in application scenarios where no good model equations exist. In this chapter, we'll also go over the basics of different neural network architectures. Next to the training methodology, this is an important choice.
For supervised training, we're faced with an
unknown function $f^*(x)=y^*$, for which we collect a sufficiently large number of data pairs $[x_i, y^*_i]$ (the training data set). We then train a neural network representation $f(x;\theta)$ with parameters $\theta$ to approximate $f^*$ by minimizing the loss:
$$ \text{arg min}_{\theta} \sum_i \Big(f(x_i ; \theta)-y^*_i \Big)^2 . $$ (supervised-training)
This will give us parameters $\theta$ such that $f(x;\theta) \approx y^*$, as accurately as possible given the chosen architecture and training data.
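As a concrete illustration, here is a minimal sketch of this optimization: a tiny one-hidden-layer network with manually derived gradients, fitted to data pairs via plain gradient descent. The target function $\sin(x)$, the network size, and the learning rate are all illustrative choices, not prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical ground-truth function f*(x) = sin(x); in practice we
# would only have access to the sampled data pairs [x_i, y*_i]
x = rng.uniform(-3.0, 3.0, size=(256, 1))
y_star = np.sin(x)

# a small one-hidden-layer MLP f(x; theta), theta = (W1, b1, W2, b2)
W1 = rng.normal(0.0, 0.3, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0.0, 0.3, (32, 1)); b2 = np.zeros(1)

lr = 0.02
for step in range(10000):
    # forward pass through the network
    h = np.tanh(x @ W1 + b1)
    y = h @ W2 + b2
    # the supervised loss: mean of (f(x_i; theta) - y*_i)^2
    loss = np.mean((y - y_star) ** 2)
    # backward pass: gradients of the loss w.r.t. theta (manual backprop)
    dy = 2.0 * (y - y_star) / len(x)
    dW2 = h.T @ dy; db2 = dy.sum(0)
    dh = (dy @ W2.T) * (1.0 - h**2)
    dW1 = x.T @ dh; db1 = dh.sum(0)
    # plain gradient descent step on all parameters
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final training loss: {loss:.4f}")
```

Real projects would of course use a framework with automatic differentiation and a more capable optimizer, but the objective being minimized is exactly the sum of squared differences above.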
The training data typically needs to be of substantial size, and hence it is attractive
to use numerical simulations solving a physical model to generate large numbers of reliable input-output pairs.
On the other hand, this approach inherits the common challenges of replacing experiments with simulations: first, we need to ensure the chosen model has enough power to predict the behavior of the simulated phenomena that we're interested in. In addition, the numerical approximations introduce errors which need to be kept small enough for the chosen application; otherwise, even the best NN has no chance of providing a useful answer later on. As these topics are studied in depth for classical simulations, the existing knowledge can likewise be leveraged to set up DL training tasks.
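To make this concrete, the following sketch generates a supervised data set from a cheap numerical solver: a 1D diffusion equation stepped with explicit finite differences. The PDE, discretization, and data set sizes are illustrative assumptions; a real project would substitute its own (typically far more expensive) simulator.

```python
import numpy as np

rng = np.random.default_rng(1)

def diffuse(u, nu=0.5, dt=0.2, steps=50):
    """Explicit finite-difference solver for 1D diffusion u_t = nu * u_xx
    with periodic boundaries (stable here since nu*dt <= 0.5 for dx = 1);
    a cheap stand-in for a more expensive physical model."""
    for _ in range(steps):
        u = u + nu * dt * (np.roll(u, 1) - 2.0 * u + np.roll(u, -1))
    return u

# build the supervised data set: inputs x_i (initial states) paired with
# targets y*_i (the simulated states after the diffusion process)
n_samples, n_cells = 100, 64
x_data = rng.normal(size=(n_samples, n_cells))
y_data = np.stack([diffuse(u) for u in x_data])
```

A network trained on such pairs learns to map an initial state directly to the simulated outcome, which is exactly the surrogate-model setting described above.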
**Figure** (`supervised-training`): A visual overview of supervised training. It's simple, and a good starting point
in comparison to the more complex variants we'll encounter later on.
The numerical approximations of PDE models for real-world phenomena are often very expensive to compute. A trained NN, on the other hand, incurs a constant cost per evaluation, and is typically trivial to evaluate on specialized hardware such as GPUs or NN compute units.
Despite this, it's important to be careful:
NNs can quickly generate huge numbers of intermediate results. Consider a CNN layer with many feature channels: applied to a high-resolution input, a single layer can already produce millions of intermediate values, all of which need to be stored for the backpropagation pass during training.
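A quick back-of-the-envelope calculation illustrates the scale; the resolution and channel count below are illustrative numbers, not taken from the text.

```python
# activation count for one hypothetical convolutional layer:
# C output channels on an H x W input produce H * W * C intermediate
# values, which must be kept in memory for the backpropagation pass
H, W, C = 128, 128, 128        # example input resolution and channel count
values = H * W * C             # number of stored activations
mib_fp32 = values * 4 / 2**20  # 4 bytes per float32 value
print(values, mib_fp32)        # -> 2097152 values, 8.0 MiB for one layer
```

Stacking a few dozen such layers, and batching multiple inputs per training step, quickly pushes this into the gigabyte range.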
An important decision to make at this stage is which neural network architecture to choose.