To Model or not to Model
A fundamental question of numerical modeling is: should we do it at all for a given problem? This may seem a little odd, but believe me, modeling is a huge amount of work, and we really need to answer this question, based on what we know and what we want to achieve, before we even start to think about any details.
I want to make sure you realize that this is a very important question, and one that must be answered precisely. There are several valid reasons for modeling:
There are many more cases in which models are very poor choices; note that this is the general case in the geosciences:
My first rule of modeling is:
(1) only model when you already know the answer!
We will talk about the reason for this in class, but a big part of it is the second rule of modeling, namely:
(2) any (every?) model has errors,
and the larger the model, or the more unknown the model space, the more likely those errors are to be important.
We can round this discussion out with my 3rd and 4th rules:
(3) the best use for models in the geosciences is to investigate and understand the real question, not to find the answer.
After all, science is mainly about understanding the questions (engineering is about finding answers).
(4) everyone should write one large model, but a smart scientist learns from that experience.
(I leave it to you to figure out what should be learned.)
Numerical Approaches
I want to make an important
distinction between numerical modeling and numerical experiments. Researchers often carry out numerical or
analytical experiments, such as “back of the envelope” calculations, or scaling
arguments, or reduction of dimensions.
An experiment is what I call an investigation where the researcher
either doesn’t understand enough to model, or just wants to avoid the
complexity of modeling, but knows enough about a problem to ask “what if”
questions.  The answers are used to shape our thinking, but cannot be strongly defended (and often turn out later to be incorrect).  Numerical experiments often
use similar (often simpler) code compared to numerical models, but should
represent a much greater spirit of investigation and usually require more
mature scientific thinking. By learning
how to do numerical models you will be in a much better position to be able to
segue into numerical experiments, and thereby increase your productivity. In my world-view, numerical experiments are
what scientists should aspire to, while most numerical modeling is best left to
engineers.
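A "back of the envelope" numerical experiment of this kind can be a few lines of code. As an illustrative sketch (the diffusivity and length scales below are typical orders of magnitude, not data for any real problem), the characteristic timescale for heat to diffuse a distance L scales as t ~ L²/κ:

```python
# Back-of-the-envelope scaling: how long does heat take to diffuse a
# distance L?  For diffusion the characteristic timescale is
#     t ~ L**2 / kappa
# where kappa is the thermal diffusivity.  Values are illustrative.

kappa = 1e-6               # thermal diffusivity of rock, m^2/s (typical order)
seconds_per_year = 3.156e7

for L in (1.0, 1e3, 1e5):  # 1 m, 1 km, 100 km
    t = L**2 / kappa       # seconds
    print(f"L = {L:8.0f} m  ->  t ~ {t / seconds_per_year:.2e} yr")
```

A "what if" answer like this shapes our thinking about which processes matter at which scales, without any claim to being a defensible model result.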
Understanding the difference between modeling and experimenting, as well as between good and bad modeling, is good insurance in this day and age.  It allows you to protect yourself from the mass of scientific errors (quacks? hacks?) that hide behind unsound models.  As you can tell, bad modeling is a personal peeve of mine.
Models
I define a model as something that abstracts (mimics a simplified version of) reality and allows us to manipulate inputs, outputs, and the transfer functions that convert inputs into outputs. You can make virtually any type of model, from physical models, to analog models, to numerical models, to whatever you feel like. And of course you can model just about anything. However, in this course we make a strong distinction between models that allow a strong degree of validation and those that don't.  Models of stock market behavior, lotteries, etc. are outside this course because they are difficult (impossible) to check, and besides, they violate the first rule.  (Numerical experiments can't usually be validated either, but their actual results are only of passing interest; mainly you are interested in the process.  Even so, you shouldn't put too much faith in experiments either.)
There are many ways you can approach modeling of the world. One way is to break problems into discrete and continuous, and to recognize that computers deal most easily with discrete problems.  Historically, however, computers have not usually been used that way.  The majority of problems are set up as a continuum; in other words, they are assumed to be slowly varying, smooth functions of space and time.  This is a holdover from the world of continuous mathematics.  Most of the math we learn is based on the concept of smooth curves and surfaces that are generally differentiable, i.e. they have a defined slope and are not, except at single points, kinked.  The fact that we go through a process of taking the real world, casting it into continuous math, and then trying to model does not mean that this is the best approach.  More and more, researchers are skipping the intermediate step of casting the real world into math, and jumping straight to cellular or inherently discrete models.  The benefits are great in terms of speed and complexity, but the costs are huge in that we move further away from being able to prove our results to be correct.  Note, though, that most proofs of any sort regarding numerical modeling are based on the underlying continuous math, not on the numerics.
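To make the contrast concrete, here is a minimal sketch of an inherently discrete (cellular) model: a toy 1D "sandpile" hillslope. The toppling rule and threshold are invented purely for illustration, but notice that no differential equation is written down at any point; the rule itself is the model.

```python
# Toy cellular model of a 1D hillslope (illustrative rule, not a
# calibrated model): wherever the local slope exceeds a threshold,
# move one grain downslope; repeat until the profile is stable.

def relax(heights, threshold=2):
    """Return a stable profile reached by repeated toppling."""
    h = list(heights)
    moved = True
    while moved:
        moved = False
        for i in range(len(h) - 1):
            if h[i] - h[i + 1] > threshold:   # too steep here
                h[i] -= 1                     # topple one grain
                h[i + 1] += 1                 # ...to the next cell
                moved = True
    return h

print(relax([8, 4, 1, 0]))
```

Note that material is conserved by construction (every grain removed from one cell lands in the next), which in a continuum formulation would have to be proved from the equations.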
Validation of models is itself a huge topic. Essentially, it is impossible to validate a model of any complexity: this would require a vast amount of input and output data spanning the entire range of model response, which would essentially make the model moot.
However, if we restrict ourselves to models that can be spot-checked over at least part of their range, we find that models generally require three essentials: a domain, boundary (and initial) conditions, and a kernel.
In a steady-state model (one that doesn't change with time), the boundary conditions (BCs) don't change. In a transient problem (one that does change in time), the boundary conditions may change, and in addition we must specify starting conditions (boundaries in time), referred to as initial conditions (ICs).
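As a sketch of how these pieces appear in practice, here is a toy transient 1D diffusion problem. The grid size, step count, and stability parameter below are illustrative choices, not tuned for any real problem; the point is only to show where the ICs, BCs, and kernel each enter.

```python
import numpy as np

# Toy transient 1D diffusion (illustrative parameters only).

nx, nt = 11, 200
r = 0.25                      # kappa*dt/dx**2, kept below 0.5 for stability

T = np.zeros(nx)              # initial condition: interior starts cold
T[0], T[-1] = 100.0, 100.0    # boundary conditions: both ends held hot

for _ in range(nt):           # the kernel: a discrete diffusion update
    T[1:-1] = T[1:-1] + r * (T[2:] - 2 * T[1:-1] + T[:-2])

print(T.round(1))             # the interior has relaxed toward the BCs
```

Running this shows the boundary values propagating inward with time, exactly the behavior described above: after enough steps the interior is controlled almost entirely by the BCs.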
A Few Comments on the Elements of Models
Many beginning modelers underestimate the absolute necessity of BCs and ICs, and the overriding need to fully specify the domain. In fact, in most physical models, the boundary conditions are more important than the kernel. This is because, in most physical processes, the values of the boundary conditions propagate into the interior of the problem with time. A corollary is that modeling requires a lot of data on the domain and the BCs to get accurate results. Typically, errors in output stem as much or more from a lack of data than from errors in the model itself.
When you go to a talk, and are
impressed by the sophistication of the model, and the pretty output graphics,
be sure to ask yourself: was there really enough data at the input to reach
this output uniquely? (Mark Twain said it better in his famous quote).
The Kernel
Most models of geo problems
start with trying to describe some physical process in some mathematical form.
There are a variety of different approaches, but one of the oldest and most
successful approaches has been to look for a differential (or
integral-differential) equation whose response mimics the physical system.
Although this step is often glossed over, it is actually a difficult step to
make in any new process. As can be seen in the note (and our work) on heat
flow, one usually makes numerous simplifying assumptions in producing the
mathematical equation. It is very easy to lose sight of these assumptions after the model is made, and it is very difficult to know, while making the model, which assumptions are justified and which will turn out to matter later on.
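As a reminder of what such a kernel looks like, the familiar one-dimensional heat conduction equation makes a good example, because its compact form hides exactly the kind of simplifying assumptions just described:

```latex
% One-dimensional heat conduction: temperature $T(x,t)$,
% thermal diffusivity $\kappa$ (assumed constant).
\frac{\partial T}{\partial t} = \kappa \,\frac{\partial^2 T}{\partial x^2}
% Assumptions baked into this compact form: one spatial dimension,
% constant conductivity and heat capacity, no internal heat sources,
% and Fourier's law relating heat flux to the temperature gradient.
```

Each of those assumptions was made while casting the physics into math, and each can come back to matter once the model is in use.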
To a large extent, after you
have settled on a kernel equation, and assembled your data on BCs and the
solution domain, you have very little thinking left to do, merely a lot of hard
work. When most people think of modeling, they think of this final part of the
problem. However, most of the thinking should have gone into the first part.
Solving the kernel equation
If you have ended up with a differential equation as your kernel, you are lucky: there are many different solution techniques. Two of the largest classes of solvers are those based on Finite Differences and those based on Finite Elements. In Finite Difference techniques, you construct an approximate numerical description of the kernel equation, and then solve that approximation exactly. In Finite Element techniques, you preserve (in theory) the exact equation, but seek solutions that approximate it in some sense.  F.E. methods are generally somewhat better (faster, and easier to focus in on a small sub-region); however, F.D. methods are generally easier to code initially and are often used in preliminary work and non-production code.
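As a sketch of the Finite Difference half of that distinction (the grid size and boundary values below are illustrative), the steady-state kernel d²T/dx² = 0 on [0, 1] can be replaced by a discrete approximation at each grid point, and that discrete system then solved exactly:

```python
import numpy as np

# Finite differences for the steady-state kernel d2T/dx2 = 0 on [0, 1]
# with T(0) = 0 and T(1) = 1 (illustrative choices).  The second
# derivative is replaced by (T[i-1] - 2*T[i] + T[i+1]) / dx**2 = 0,
# giving a linear system that is solved exactly.

n = 5                                  # number of interior unknowns
A = (np.diag([-2.0] * n)
     + np.diag([1.0] * (n - 1), 1)
     + np.diag([1.0] * (n - 1), -1))   # tridiagonal FD matrix
b = np.zeros(n)
b[-1] = -1.0                           # BC T(1) = 1 folded into the RHS

T_interior = np.linalg.solve(A, b)
print(T_interior)                      # a straight line between the BCs
```

The approximation lives entirely in the discretization; the linear algebra itself is exact, which is what "solve that approximation exactly" means above.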
If you don’t end up with a
simple equation in the kernel, then you need to be creative. We will investigate a couple of problems that
don’t lend themselves to pretty differential equation solutions.