Mathematical logic has a categorization of sentences in terms of increasing complexity called the Arithmetic Hierarchy. This hierarchy defines sets of sentences and
for all nonnegative integers
. The definition is as follows:
and
are both equal to the set of sentences
such that a computer can determine the truth or falsity of
in finite time.Sentences in
are essentially computations, things like
or “For all
less than a million, if
then
.”
for
is defined as follows: If, for all
,
is a
sentence, then
is a
sentence.
- Similarly, if for all
,
is a
sentence, then
is a
sentence.
It’s not obvious from the definition, but by encoding pairs, triples, etc. of natural numbers as single natural numbers, you are allowed to include multiple instances of the same quantifier without moving up the hierarchy. For example, a sentence is
if
is
. Essentially
contains all statements with
quantifier alternations where the outermost quantifier is a
and
contains all statements with
quantifier alternations where the outermost quantifier is an
.
Many number-theoretic and combinatorial statements are : Fermat’s last theorem, Goldbach’s conjecture, and the Riemann Hypothesis (which can be seen to be in
by a formulation of Jeffrey Lagarias). By encoding finite groups as numbers, theorems like the classification of finite simple groups can also be seen to be
.
Note that statements of the form where
is some computable function are also in
since for a fixed
,
is checkable in finite time by a computer.
There are many sentences which are of the form in
for which we actually know a bound on
, so we get a sentence in
when we use that bound. For example, the statement that there are infinitely many primes is in
, since it can be written as “For all
, there exists an
such that
is prime”, but we also know a version that’s in
, for example, we know that there’s a prime between
and
for any
.
There is a theorem that the sentences in this hierarchy actually do get harder to decide the truth of in general:
Theorem: For any
: even in a programming language augmented with the magical ability to decide the truth of
sentences, it is not possible to write a program which decides the truth of every
sentence.
A Scientific Hierarchy
What if we used the above idea to classify scientific statements instead of number-theoretic ones? We would get a new hierarchy where:
and
are both equal to the set of sentences
such that a definite experiment can be performed which demonstrates the truth or falsity of
.
and
for
are defined just as before.
Examples of statements in this classification would be things like: Any pair of objects dropped in a vacuum will fall with the same acceleration; or given any gram of liquid water at a standard pressure, adding 4.18 joules of energy will raise it 1 degree Celsius.
Of course, those examples are idealized: to make them actually true, you have to add in lots more caveats and tolerances: for example, you have to say that any pair of objects weighing no more than such-and-such will fall with the same acceleration up to such-and-such tolerance (and add in various other caveats). More on this later.
An example of a scientific statement might be something like: if you put any two objects in thermal contact, then for any given tolerance, there will be a time at which their temperatures are equal to within that tolerance.
You might have to use a statement of complexity to say something about randomness. For example, a
scientific statement might be: Suppose you are doing an experiment where you acquire a carbon-11 atom, wait for it to decay to boron-11 and record the time, then repeat. Then, for any tolerance
, there will be an
such that for any
the proportion of the first
carbon-11 atoms that decayed in less than 20.334 minutes will be no more than
away from 0.5.
A Scientific Hierarchy Theorem?
In the Arithmetic Hierarchy case, we had a theorem saying that statements higher up the hierarchy were qualitatively harder to decide the truth of. Is there a similar theorem here?
In fact, you can make this precise (although, to be honest, I’m not sure what the cleanest way to do it is). In particular, statements are learnable in the limit of acquiring all the experimental data, while
statements aren’t. One way to make this rigorous is: Suppose you have a
statement
, where for any object
(assume that
ranges over some countable set
of physical objects),
can be established or refuted by a single experiment.
Now, suppose that you have a probability distribution over all possibilities for for objects
: that is, you have a probability distribution
on the Cantor space where the elements of the sample space are to be interpreted as full specifications of whether or not
holds for each
: e.g., things like
or
.
Call events which are finite conjunctions of events of the form or
basic events.
Now say that this probability distribution is open-minded with respect to an event if, for any basic event
, if
and
are consistent, then
.
Now, assuming that is open-minded with respect to the event
, it’s pretty easy to show that
, where
. That is, if
is actually true, it will be learned. (Of course, if it is false, that will be learned in finite time automatically.)
On the other hand, for general statements, it’s pretty easy to see that this is not the case: In fact, any computable procedure assigning a probability for a statement
just based on seeing finitely many data points of the form
or
can be tricked by some value of
by using the definition of the computable procedure in
itself.
I would like to emphasize though, that the above are off-the-cuff basically-trivial observations. There must be a cleaner, nicer framework to state a scientific hierarchy theorem, but I don’t know what it is.
Science is Hard
In my opinion, thinking about the complexity of scientific statements in terms of quantifiers yields some insights.
For example, the case is basically the scientific method as learned in elementary school: a hypothesis is generated, and by repeated experiments we either reject it, or gradually come to correctly accept the truth of it.
The fact that science is not so simple is seen by just recognizing that there are scientific hypotheses that are not in form. However, probably even more significant is the fact that many real-world phenomena add quantifiers to our hypotheses:
- Measurement errors.As alluded to above, to make our scientific statements true, we have to add tolerances. If we don’t know what the tolerances should be when we make the hypothesis, we have to add at least one quantifier.
- Not everyone can perform the experimentsNot everyone performs experiments, in fact, most people rely on scientific consensus.
That automatically brings the hypothesis up to (at least), for example:
There exists a time, such that for all times
the scientific consensus at time
will be in favor of global warming (or evolution, or that eggs are good for you, or whatever).
Where can I go for more information?
I feel like thinking about the complexity of scientific hypotheses in terms of quantifier alternations is so natural that it must have been studied before, but I can’t find anything by googling around. Does anyone know where to find more information on this?