The Spectrum From Logic to Probability

Let $\Omega$ be the set of propositions considered by some rational logician (call her Sue). Further, suppose that $\Omega$ is closed under the propositional connectives $\vee$ , $\wedge$ , $\neg$ . Here are two related but different preorders on $\Omega$ :

$p\leq q$ if $p$ logically entails $q$.
$p \preceq q$ if Sue considers $q$ at least as likely to be true as $p$ is.

Let $\sim$ be the equivalence relation defined by $p \sim q$ iff $p \leq q \wedge q \leq p$ and let $\approx$ similarly be defined by $p \approx q$ iff $p\preceq q\wedge q\preceq p$ .

Then we know what type of structure $\Omega/{\sim}$ is: since we’re assuming classical logic in this article, it’s a Boolean algebra. What type of structure is $\Omega/{\approx}$ ?

We can at least come up with a couple of examples. Since Sue is a perfect logician, it must be that if $p\leq q$ , then $p\preceq q$ . If Sue is extremely conservative, she may decline to offer opinions about whether one proposition is more likely to be true than another except when she’s forced to by logic. In this case, $\Omega/{\approx}$ is equal to $\Omega/{\sim}$ and therefore again a Boolean algebra.

In the other extreme, Sue may have opinions about every pair of propositions, making $\preceq$ a total ordering. A principal example of this is where $\Omega/{\approx}$ is isomorphic to a subset of $[0,1]$ and Sue’s opinions about the propositions were generated by her assigning a probability $P(p)\in [0,1]$ to every proposition $p$.
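The probabilistic extreme can be made concrete with a small sketch. It assumes a toy setup that is not in the original discussion: propositions are sets of three hypothetical "worlds" `a`, `b`, `c`, entailment is set inclusion, and the weights are arbitrary illustrative values.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical worlds and weights, chosen only for illustration.
WORLDS = ("a", "b", "c")
WEIGHT = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

# Every proposition is identified with the set of worlds where it holds.
PROPS = [frozenset(c) for r in range(len(WORLDS) + 1)
         for c in combinations(WORLDS, r)]

def prob(p):
    """Sue's probability for proposition p (exact rational arithmetic)."""
    return sum((WEIGHT[w] for w in p), Fraction(0))

def entails(p, q):
    """p <= q in the Boolean algebra: p logically entails q."""
    return p <= q

def at_most_as_likely(p, q):
    """p 'preceq' q: Sue considers q at least as likely as p."""
    return prob(p) <= prob(q)

# Logic forces likelihood: entailment implies the likelihood ordering.
entailment_respected = all(at_most_as_likely(p, q)
                           for p in PROPS for q in PROPS if entails(p, q))
# The likelihood preorder is total, while entailment is not.
preceq_total = all(at_most_as_likely(p, q) or at_most_as_likely(q, p)
                   for p in PROPS for q in PROPS)
entailment_total = all(entails(p, q) or entails(q, p)
                       for p in PROPS for q in PROPS)
# Number of equivalence classes under 'approx' (distinct probabilities).
num_classes = len({prob(p) for p in PROPS})
```

With these particular weights, two logically incomparable propositions can land in the same class of $\approx$, which is why the quotient has fewer elements than the Boolean algebra itself.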

What’s in between on the spectrum from logic to probability? Are there totally ordered structures not isomorphic to $[0,1]$ or a subset of it? More ambitiously: every Boolean algebra has operations $\vee$, $\wedge$, $\neg$, while $[0,1]$ has operations $+$, $\times$, $(x\mapsto 1-x)$ which play similar roles in the computation of probabilities (note that $+$ is partial on $[0,1]$). How are these related, and does every structure on the spectrum from logic to probability have analogous operations?

These structures (i.e., structures of the form $\Omega/{\approx}$ for some acceptable $\preceq$ in a sense to be defined below) were named scales by Michael Hardy, who defined and explored them in a very nice paper.

The Definition of a Scale

Modding out by the equivalence relations once and for all, the general setup is that we have a map $\rho$ (induced by the identity function on $\Omega$ in the above setup) from a Boolean algebra $\mathbb{A}$ to a poset $\mathcal{R}$ . What should be true of $\rho$ ?

If a proposition $p$ logically entails a proposition $q$, Sue will consider $q$ at least as likely to be true as $p$, so we should have that $x \leq y$ implies $\rho(x) \leq \rho(y)$ ($\leq$ will now be the ordering in either $\mathbb{A}$ or $\mathcal{R}$, depending on context). In fact, we should have that $x < y$ implies $\rho(x) < \rho(y)$.

Actually we should have more. For example, it should be the case that if $x \leq y$, then $\rho(\neg y) \leq \rho(\neg x)$. In general, if $\phi(x)$ is a propositional formula where $x$ appears negatively (that is, all occurrences of $x$ are negated in a normal form of $\phi$), then $x \leq y$ should imply $\rho(\phi(y)) \leq \rho(\phi(x))$, and the reverse is true if $x$ appears positively in $\phi$. Furthermore, if $\phi(x)\ne \phi(y)$ we can require that the inequality be strict.

Finally, we should require that $\rho(\neg y)\leq\rho(\neg x)$ not just if $x \leq y$, but even if it only holds that $\rho(x)\leq\rho(y)$. That is, even if $p$ doesn’t logically entail $q$, if you consider $q$ more likely to be true than $p$, you should consider $\neg p$ more likely to be true than $\neg q$. A similar generalization to $\phi(x)$ holds as above.

These considerations are equivalent to Hardy’s definition:

Let $\mathbb{A}$ be a Boolean algebra, $\mathcal{R}$ be a poset, and $\rho \colon \mathbb{A}\to\mathcal{R}$ . Then $\rho$ is called a basic scaling if:

$\rho$ is strictly increasing, so that $x < y$ implies $\rho(x) < \rho(y)$.

$\rho$ preserves relative complementation, so that if $x,y \in [a,b]\subseteq \mathbb{A}$ and $\rho(x) < \rho(y)$ , then $\rho(\neg y_{[a,b]}) < \rho(\neg x_{[a,b]})$ , where $\neg x_{[a,b]}$ is the relative complement $a\vee (b\wedge \neg x)$ .
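Both conditions can be checked by brute force on a small case. The sketch below assumes a finite Boolean algebra (subsets of three hypothetical worlds) and takes $\rho$ to be a probability measure with arbitrary positive weights; it illustrates the definition and is not taken from Hardy's paper.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite Boolean algebra: all subsets of three worlds.
WORLDS = (0, 1, 2)
WEIGHT = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
ELEMENTS = [frozenset(c) for r in range(len(WORLDS) + 1)
            for c in combinations(WORLDS, r)]

def rho(x):
    """Candidate basic scaling: the probability of x."""
    return sum((WEIGHT[w] for w in x), Fraction(0))

def rel_complement(x, a, b):
    """Relative complement of x in [a, b], i.e. a OR (b AND NOT x)."""
    return a | (b - x)

def strictly_increasing():
    # For frozensets, x < y means x is a proper subset of y.
    return all(rho(x) < rho(y)
               for x in ELEMENTS for y in ELEMENTS if x < y)

def preserves_relative_complements():
    for a in ELEMENTS:
        for b in ELEMENTS:
            if not a <= b:
                continue
            interval = [x for x in ELEMENTS if a <= x <= b]
            for x in interval:
                for y in interval:
                    if rho(x) < rho(y) and not (
                        rho(rel_complement(y, a, b))
                        < rho(rel_complement(x, a, b))
                    ):
                        return False
    return True
```

For a probability measure the second condition follows from $\rho(a \vee (b\wedge\neg x)) = \rho(a) + \rho(b) - \rho(x)$ whenever $a \leq x \leq b$, so the brute-force check is only a sanity test of the definitions.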

Hardy proves that the relative complement operation is well-defined on $\mathcal{R}$, that is, that $\rho(\neg x_{[a,b]})$ depends only on $\rho(x)$, $\rho(a)$, and $\rho(b)$. Note, however, that it is a partial operation: even if $\theta < \eta < \mu$ in $\mathcal{R}$, there is no guarantee that there exist $x, a, b\in\mathbb{A}$ such that $\rho(x) = \eta$, $\rho(a) = \theta$, $\rho(b) = \mu$.

A scale is then defined as a poset together with a partial relative complement operation which is the range of a basic scaling.

An Example

Hardy’s paper gives many examples of scales, including a few pretty wild ones. Here’s one: Let $\mathbb{A}$ be the Boolean algebra of subsets of $\mathbb{N}$. Let $\rho(S) = \rho(T)$ iff $S = T$ or $|S\setminus T| = |T \setminus S| < \aleph_0$. Let $\rho(S) < \rho(T)$ iff $|S\setminus T| < |T\setminus S|$. This defines a basic scaling to a scale $\mathcal{R}$.
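The comparison rule can be tried out on the part of $\mathbb{A}$ that is easy to represent on a computer. The sketch below assumes a hypothetical `FinCofin` class covering only finite and cofinite subsets of $\mathbb{N}$ (so it says nothing about the sets in between), with `math.inf` standing in for $\aleph_0$.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class FinCofin:
    """A finite subset of N, or (if cofinite=True) the complement of one."""
    elems: frozenset
    cofinite: bool = False

def fin(*xs):
    return FinCofin(frozenset(xs))

def cofin(*xs):
    """The set of all naturals except xs."""
    return FinCofin(frozenset(xs), cofinite=True)

def diff_size(s, t):
    """Cardinality of 's minus t', with math.inf for an infinite difference."""
    if not s.cofinite and not t.cofinite:
        return len(s.elems - t.elems)
    if not s.cofinite and t.cofinite:
        return len(s.elems & t.elems)   # s minus (N minus B) = s AND B
    if s.cofinite and t.cofinite:
        return len(t.elems - s.elems)   # (N minus A) minus (N minus B) = B minus A
    return math.inf                      # cofinite minus finite is infinite

def compare(s, t):
    """-1, 0, 1 for rho(s) <, =, > rho(t); None if incomparable."""
    a, b = diff_size(s, t), diff_size(t, s)
    if s == t or (a == b and a < math.inf):
        return 0
    if a < b:
        return -1
    if b < a:
        return 1
    return None  # both differences infinite; cannot happen for this class
```

On this fragment, the value of a finite set depends only on its size and the value of a cofinite set only on the size of its complement, so every finite set sits below every cofinite one.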

What does it look like? Every element except for $\rho(\emptyset)$ has an immediate predecessor, and every element except for $\rho(\mathbb{N})$ has an immediate successor. Therefore, it is partitioned into “galaxies” $\{\ldots,\alpha - 1, \alpha, \alpha + 1, \ldots\}$ together with an initial galaxy $\{\rho(\emptyset), \rho(\emptyset) + 1, \ldots\}$ and a final galaxy $\{\ldots, \rho(\mathbb{N}) - 1,\rho(\mathbb{N})\}$. Between any two galaxies that are comparable, there are uncountably many galaxies and infinite antichains of galaxies.

Analogues of $\vee$ , $\wedge$ , $\neg$

We already know that there are appropriate analogues of $\neg$ in all scales, since we know that relative complementation carries over in a well-defined way from the domain Boolean algebra.

What about $\vee$ ? Hardy proves the following:

(1) If $x \wedge y = 0$ in $\mathbb{A}$, then $\rho(x \vee y)$ depends only on $\rho(x)$ and $\rho(y)$. In this case we define $\rho(x) + \rho(y)$ to be $\rho(x\vee y)$.

(2) For $\theta, \eta\in \mathcal{R}$, if $\theta + \eta$ exists then $\theta\leq \neg\eta$.

It turns out that, for any $\theta \in\mathcal{R}$, the operation $\eta\mapsto \theta + \eta$ is a partial injective map. Let $\eta\mapsto \eta - \theta$ be its inverse.

Hardy calls a scale divided if the necessary condition for $\theta + \eta$ existing given by (2) above is also sufficient. He proves:

For any divided scale, and any $x$, $y$ in the Boolean algebra $\mathbb{A}$, $\rho(x\vee y) = \rho(x) + (\rho(y) - \rho(x\wedge y)).$

In other words, all divided scales do have a $+$ operation, which satisfies the appropriate law from probability theory.
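In the familiar case of probabilities this is the usual inclusion-exclusion identity. The sketch below checks it exhaustively, assuming a small hypothetical Boolean algebra with arbitrary weights; `plus` and `minus` are illustrative names for the partial operations on $[0,1]$, returning `None` where the operation is undefined.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical worlds and weights, for illustration only.
WORLDS = (0, 1, 2)
WEIGHT = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
ELEMENTS = [frozenset(c) for r in range(len(WORLDS) + 1)
            for c in combinations(WORLDS, r)]

def rho(x):
    return sum((WEIGHT[w] for w in x), Fraction(0))

def plus(theta, eta):
    """Partial sum on [0,1]: defined only when theta <= 1 - eta."""
    if theta is None or eta is None or theta > 1 - eta:
        return None
    return theta + eta

def minus(eta, theta):
    """Partial inverse of adding theta: defined only when theta <= eta."""
    if theta is None or eta is None or theta > eta:
        return None
    return eta - theta

# rho(x OR y) = rho(x) + (rho(y) - rho(x AND y)) for every pair x, y.
law_holds = all(
    rho(x | y) == plus(rho(x), minus(rho(y), rho(x & y)))
    for x in ELEMENTS for y in ELEMENTS
)
```

Note that `plus(Fraction(2, 3), Fraction(1, 2))` is undefined here, which reflects the necessary condition $\theta \leq \neg\eta$, read in $[0,1]$ as $\theta \leq 1 - \eta$.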

Finding an analogue of $\wedge$ or $\times$ is trickier, and, when he wrote the paper, Hardy only knew how to do it in the case that the scale is linearly ordered and Archimedean, defined as follows:

Let $\delta\in\mathcal{R}$. Then $\delta$ is called infinitesimal if there is an infinite subset $\mathbb{X}\subseteq \mathbb{A}$ such that $x\wedge y = 0$ for all distinct $x, y\in\mathbb{X}$ and $\delta < \rho(x)$ for all $x\in \mathbb{X}$.

A scale is called Archimedean if it is divided and has no nonzero infinitesimals.

The idea behind the definition of infinitesimal is that, assigning the Boolean algebra a total measure of 1, the measures of the elements of $\mathbb{X}$ must approach 0.
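As a worked check of the definition (using the scale of subsets of $\mathbb{N}$ from the example above; the family $X_k$ is chosen here purely for illustration), let $X_k$ be the set of positive integers with exactly $k$ trailing zeros in binary. The sets $X_k$ are infinite and pairwise disjoint, and for every $k$:

$\displaystyle{|\{0\}\setminus X_k| = 1 < \aleph_0 = |X_k\setminus\{0\}|, \quad\text{so}\quad \rho(\{0\}) < \rho(X_k).}$

Since also $\rho(\{0\}) \neq \rho(\emptyset)$, the element $\rho(\{0\})$ is a nonzero infinitesimal, so that scale is not Archimedean.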

In that case, you can define a division $\theta/\eta \in [0,\infty)$ as follows: Let $n_0$ be the maximum number of times $\eta$ can be subtracted from $\theta$, let $n_1$ be the maximum number of times that result can be subtracted from $\eta$, and so on. The quotient is then defined as the continued fraction:

$\displaystyle{n_0 + \frac{1}{n_1+\frac{1}{n_2+\cdots}}}$
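The procedure is the Euclidean algorithm phrased with repeated subtraction. The sketch below runs it on exact rationals as a stand-in for elements of an Archimedean scale; the names `cf_terms` and `cf_value` and the `max_terms` cutoff are assumptions made for illustration, and for an irrational quotient the expansion would not terminate.

```python
from fractions import Fraction

def cf_terms(theta, eta, max_terms=32):
    """Continued-fraction terms n_0, n_1, ... of theta/eta.

    Uses only comparison and subtraction, mirroring the operations
    available in the scale. Stops after max_terms terms.
    """
    terms = []
    a, b = theta, eta
    while b > 0 and len(terms) < max_terms:
        n = 0
        while a >= b:        # subtract b from a as many times as possible
            a -= b
            n += 1
        terms.append(n)
        a, b = b, a          # next: subtract the remainder from the old b
    return terms

def cf_value(terms):
    """Evaluate n_0 + 1/(n_1 + 1/(n_2 + ...)) for a finite list of terms."""
    value = Fraction(terms[-1])
    for n in reversed(terms[:-1]):
        value = n + 1 / value
    return value

# Example with theta = 3/8 and eta = 1/4.
example_terms = cf_terms(Fraction(3, 8), Fraction(1, 4))
example_quotient = cf_value(example_terms)
```

For rational inputs the loop terminates and the value recovered from the terms agrees with ordinary division.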

Then the map $\theta \mapsto \theta/1$ maps the scale injectively to a subscale of $[0,1]$ (in particular, preserving $+$). Thus, $\times$ can be pulled back from its definition on $[0,1]$.