7 Types

Definition 7.1 Type (of distribution on \(\mathbb {R}\))

Two c.d.f.s \(F, G\) are said to be of the same type, if there exists an order-preserving affine isomorphism \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) such that \(G = A . F\).

Being of the same type is an equivalence relation, and the equivalence classes are called types (of distributions on \(\mathbb {R}\)).
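
As a small illustration, assume the convention \((A.F)(x) = F\big( A^{-1}(x) \big)\) for the action of \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) on c.d.f.s (this is the convention used in the proofs below), and write \(A(x) = a x + b\) with \(a {\gt} 0\) and \(b \in \mathbb {R}\). Then the relation \(G = A.F\) reads

\begin{align*} G(x) \; = \; F \Big( \frac{x - b}{a} \Big) \qquad \text{ for all } x \in \mathbb {R}. \end{align*}

In probabilistic terms, if a random variable \(X\) has c.d.f. \(F\), then \(a X + b\) has c.d.f. \(G\), since \(\mathsf{P}[a X + b \le x] = \mathsf{P}[X \le (x-b)/a]\). For instance, all normal distributions with positive variance form a single type.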

7.1 Convergence to types

The notion of convergence of cumulative distribution functions considered here is always taken to be pointwise convergence on the set of continuity points of the limit c.d.f. By Theorem 4.8 and Lemma 4.9, this corresponds to convergence in distribution (weak convergence of probability measures). Below, when we write \(F_n \overset {\mathrm{d}}{\longrightarrow }F\) for c.d.f.s \(F_n\), \(n \in \mathbb {N}\), and \(F\), this is always what we mean.

For affine maps \(A \colon \mathbb {R}\to \mathbb {R}\), we use the topology of pointwise convergence of functions. Equivalently, convergence of affine maps means the convergence of the coefficients \(a, b \in \mathbb {R}\) in the expression \(x \mapsto a x + b\) of the affine functions (so \(A_n \to A\) if and only if the functions are of the form \(A_n(x) = a_n x + b_n\) and \(A(x) = a x + b\), and \(a_n \to a\) and \(b_n \to b\)).

Lemma 7.2 Unique affine relation among two nondegenerate c.d.f.s

Let \(F, G\) be two c.d.f.s of the same type, and \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) an affine isomorphism such that \(G = A.F\). If \(F\) is nondegenerate, then \(A\) is the only element of \(\mathrm{Aff}^+_{\mathbb {R}}\) for which the relation \(G = A.F\) holds.

Proof

Since \(F\) is nondegenerate, we can find two different points \(x_1 {\lt} x_2\) such that \(0 {\lt} F(x_1) {\lt} F(x_2) \le 1\). By right continuity of \(F\), we can assume these points to be taken minimal with the given values, i.e., \(x_j = \inf \big\{ x \in \mathbb {R}\; \big| \; F(x) = F(x_j) \big\} \) for \(j=1,2\).

The assumption \(G = A.F\) means \(G(x) = F \big( A^{-1}(x) \big)\) for all \(x \in \mathbb {R}\). Therefore \(G \big( A(x_1) \big) = F(x_1) {\lt} F(x_2) = G \big( A(x_2) \big)\). We also get \(A(x_j) = \inf \big\{ y \in \mathbb {R}\; \big| \; G(y) = F(x_j) \big\} \) for \(j=1,2\) by strict monotonicity and bijectivity of \(A\) (if, for example, there existed a \(y'_2 {\lt} A(x_2)\) such that \(G(y'_2) = F(x_2)\), then the point \(x'_2 = A^{-1}(y'_2) {\lt} x_2\) would satisfy \(F(x'_2) = F\big(A^{-1}(y'_2)\big) = G(y'_2) = F(x_2)\), contradicting the minimality of \(x_2\)).

If \(\widetilde{A} \in \mathrm{Aff}^+_{\mathbb {R}}\) is also such that \(G = \widetilde{A}.F\), then the same holds for it: \(\widetilde{A}(x_j) = \inf \big\{ y \in \mathbb {R}\; \big| \; G(y) = F(x_j) \big\} \) for \(j=1,2\). We conclude that

\begin{align*} \widetilde{A}(x_1) \, = \; & \inf \big\{ y \in \mathbb {R}\; \big| \; G(y) = F(x_1) \big\} \; = \, A(x_1) \\ \widetilde{A}(x_2) \, = \; & \inf \big\{ y \in \mathbb {R}\; \big| \; G(y) = F(x_2) \big\} \; = \, A(x_2) . \end{align*}

But an affine map of \(\mathbb {R}\) is determined by its values at two distinct points: from \(a x_1 + b = y_1\) and \(a x_2 + b = y_2\) with \(x_1 \ne x_2\) one can solve for \(a\) and \(b\). Therefore we must have \(\widetilde{A} = A\).
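
The nondegeneracy assumption cannot be dropped. As a sketch, with an example chosen here only for illustration, let \(F = G\) be the c.d.f. of the delta mass at \(0\), so \(F(x) = 0\) for \(x {\lt} 0\) and \(F(x) = 1\) for \(x \ge 0\). For any \(a {\gt} 0\), the dilation \(A(x) = a x\) satisfies

\begin{align*} (A.F)(x) \; = \; F \big( A^{-1}(x) \big) \; = \; F( x / a ) \; = \; F(x) \qquad \text{ for all } x \in \mathbb {R}, \end{align*}

because \(x/a\) has the same sign as \(x\). Thus every such dilation satisfies \(G = A.F\), and uniqueness fails for this degenerate \(F\).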

Lemma 7.3 Degeneration by shrinking affine transformations

Let \((F_n)_{n \in \mathbb {N}}\) be a sequence of c.d.f.s which converges to a c.d.f. \(G\), \(F_n \overset {\mathrm{d}}{\longrightarrow }G\). Consider affine transformations of the form \(A_n(x) = a_n x + b_n\), with \(a_n {\gt} 0\) and \(b_n \in \mathbb {R}\), such that \(a_n \to 0\) and \(b_n \to \beta \in \mathbb {R}\) as \(n \to \infty \). Then \(A_n . F_n \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}\), where \(\widetilde{G}\) is the degenerate c.d.f. of the delta mass at \(\beta \).

Proof

Since every \(x \ne \beta \) is a continuity point of \(\widetilde{G}\), it suffices to prove that for any \(x {\lt} \beta \) we have \((A_n . F_n)(x) \to 0 = \widetilde{G}(x)\) and for any \(x {\gt} \beta \) we have \((A_n . F_n)(x) \to 1 = \widetilde{G}(x)\).

Let us focus on the latter, so let \(x {\gt} \beta \). Let also \(\varepsilon {\gt} 0\); we will prove that \((A_n . F_n)(x) {\gt} 1 - \varepsilon \) for all large enough \(n\), and the claim will follow because \((A_n . F_n)(x) \le 1\).

Since \(G\) is a c.d.f., we can choose a continuity point \(z\) of \(G\) large enough so that \(G(z) {\gt} 1 - \varepsilon \). Then by the assumed convergence \(F_n \overset {\mathrm{d}}{\longrightarrow }G\), we have \(F_n(z) \to G(z)\). By definition we have

\begin{align*} (A_n . F_n)(A_n(z)) \; = \; F_n \big( A_n^{-1} (A_n(z)) \big) \; = \; F_n(z) \; \longrightarrow \; G(z) . \end{align*}

Note that \(A_n(z) = a_n z + b_n \to \beta \) as \(n \to \infty \) by the assumptions \(a_n \to 0\) and \(b_n \to \beta \). In particular, for \(n\) large enough, we have \(A_n(z) {\lt} x\). Therefore, for \(n\) large enough

\begin{align*} F_n(z) \; = \; (A_n . F_n) \big( A_n(z) \big) \; \le \; (A_n . F_n)(x) . \end{align*}

The LHS tends to \(G(z) {\gt} 1 - \varepsilon \), so for all large enough \(n\) we get

\begin{align*} 1 - \varepsilon \; {\lt} \; F_n(z) \; \le \; (A_n . F_n)(x) \; \le \; 1 . \end{align*}

This concludes the proof that \((A_n . F_n)(x) \to 1\) for all \(x {\gt} \beta \).

The proof that \((A_n . F_n)(x) \to 0\) for all \(x {\lt} \beta \) is similar.
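
A simple instance may help. Assume, purely for illustration, a constant sequence \(F_n = F\) (so that trivially \(F_n \overset {\mathrm{d}}{\longrightarrow }F\)) and the shrinking maps \(A_n(x) = x/n + \beta \), i.e., \(a_n = 1/n\) and \(b_n = \beta \). Then \(A_n^{-1}(x) = n (x - \beta )\), and

\begin{align*} (A_n . F)(x) \; = \; F \big( n (x - \beta ) \big) \; \longrightarrow \; \begin{cases} 0 & \text{ if } x {\lt} \beta \\ 1 & \text{ if } x {\gt} \beta , \end{cases} \end{align*}

since \(n (x - \beta ) \to -\infty \) in the first case and \(n (x - \beta ) \to +\infty \) in the second. At \(x = \beta \) the value is \(F(0)\) for every \(n\), which need not equal \(1\); this is harmless, because \(\beta \) is not a continuity point of the c.d.f. of the delta mass at \(\beta \).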

Lemma 7.4 Impossibility of expanding affine transformations


Let \((F_n)_{n \in \mathbb {N}}\) be a sequence of c.d.f.s which converges to a nondegenerate c.d.f. \(G\), \(F_n \overset {\mathrm{d}}{\longrightarrow }G\). Consider affine transformations of the form \(A_n(x) = a_n x + b_n\), with \(a_n {\gt} 0\) and \(b_n \in \mathbb {R}\), such that \(a_n \to +\infty \) as \(n \to \infty \). Then \(A_n . F_n\) cannot converge to any c.d.f.

Proof

Since \(G\) is assumed nondegenerate, by Lemma 1.16 we can pick two continuity points \(x_1 {\lt} x_2\) of \(G\) such that \(0 {\lt} G(x_1) \le G(x_2) {\lt} 1\). Then from the assumption \(F_n \overset {\mathrm{d}}{\longrightarrow }G\) we get \(F_n(x_1) \to G(x_1)\) and \(F_n(x_2) \to G(x_2)\).

Assume, by way of contradiction, that we have convergence \(A_n . F_n \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}\) to some c.d.f. \(\widetilde{G}\).

We claim that then \(\big(A_n(x_1)\big)_{n \in \mathbb {N}}\) is bounded from below and \(\big(A_n(x_2)\big)_{n \in \mathbb {N}}\) is bounded from above. Since

\begin{align*} a_n = \frac{A_n(x_2) - A_n(x_1)}{x_2 - x_1} , \end{align*}

this will show that \((a_n)_{n \in \mathbb {N}}\) is bounded from above, contradicting the assumption \(a_n \to +\infty \), and finishing the proof.

To show that \(\big(A_n(x_1)\big)_{n \in \mathbb {N}}\) is bounded from below, choose a continuity point \(z\) of \(\widetilde{G}\) such that \(\widetilde{G}(z) {\lt} G(x_1)\). Then the assumed convergence \(A_n . F_n \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}\) implies \((A_n . F_n)(z) \to \widetilde{G}(z)\). On the other hand, if \(\big(A_n(x_1)\big)_{n \in \mathbb {N}}\) is not bounded from below, then along some subsequence \((n_k)_{k \in \mathbb {N}}\) of indices we have \(A_{n_k}(x_1) {\lt} z\), and for those indices we then have

\begin{align*} F_{n_k}(x_1) \; = \; (A_{n_k} . F_{n_k}) \big( A_{n_k}(x_1) \big) \; \le \; (A_{n_k} . F_{n_k})(z) . \end{align*}

The LHS tends to \(G(x_1)\) as \(k \to \infty \), whereas the RHS tends to \(\widetilde{G}(z)\). We get \(G(x_1) \le \widetilde{G}(z)\), contradicting the choice of \(z\). This shows that \(\big(A_n(x_1)\big)_{n \in \mathbb {N}}\) must in fact be bounded from below.

The proof that \(\big(A_n(x_2)\big)_{n \in \mathbb {N}}\) must be bounded from above is similar.
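
To see concretely what goes wrong, consider the illustrative choice of a constant sequence \(F_n = F\) with \(F\) nondegenerate, and the expanding maps \(A_n(x) = n x\). Writing \(F(0-) = \lim _{y \uparrow 0} F(y)\) and using right continuity of \(F\), we get

\begin{align*} (A_n . F)(x) \; = \; F ( x / n ) \; \longrightarrow \; \begin{cases} F(0-) & \text{ if } x {\lt} 0 \\ F(0) & \text{ if } x {\gt} 0 . \end{cases} \end{align*}

The pointwise limit is constant on each of the two half-lines. If \(A_n . F\) converged in distribution to a c.d.f. \(\widetilde{G}\), then \(\widetilde{G}\) would agree with this limit at its continuity points, and the limits \(0\) at \(-\infty \) and \(1\) at \(+\infty \) of a c.d.f. would force \(F(0-) = 0\) and \(F(0) = 1\). That would make \(F\) the degenerate c.d.f. of the delta mass at \(0\), contrary to the assumption.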

Theorem 7.5 Convergence to types

Suppose that \((F_n)_{n \in \mathbb {N}}\) is a sequence of c.d.f.s which converges to a nondegenerate c.d.f. \(G\), i.e., \(F_n \overset {\mathrm{d}}{\longrightarrow }G\) as \(n \to \infty \). Let \((A_n)_{n \in \mathbb {N}}\) be a sequence of oriented affine isomorphisms of \(\mathbb {R}\), \(A_n \in \mathrm{Aff}^+_{\mathbb {R}}\) such that \(A_n.F_n \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}\) for some c.d.f. \(\widetilde{G}\).

If we write \(A_n(x) = a_n x + b_n\), then \((a_n)_{n \in \mathbb {N}}\) and \((b_n)_{n \in \mathbb {N}}\) are bounded sequences.

If \(\widetilde{G}\) is nondegenerate, then \(A_n \to A \in \mathrm{Aff}^+_{\mathbb {R}}\) and \(A.G = \widetilde{G}\). In particular \(G\) and \(\widetilde{G}\) are of the same type. Moreover, \(A\) is the unique affine transformation for which the equality \(A.G = \widetilde{G}\) holds.

Proof

Let us first argue that \((a_n)_{n \in \mathbb {N}}\) is bounded. If not, then by passing to a subsequence, we have \(a_{n_k} \to +\infty \). But since \(F_{n_k} \overset {\mathrm{d}}{\longrightarrow }G\) and \(G\) is nondegenerate, it contradicts Lemma 7.4 to have \(A_{n_k} . F_{n_k} \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}\). Therefore \((a_n)_{n \in \mathbb {N}}\) must be bounded: there exists some \(M{\gt}0\) such that \(a_n \le M\) for all \(n \in \mathbb {N}\).

Let us then argue that \((b_n)_{n \in \mathbb {N}}\) is bounded. If not, then we can extract a subsequence such that either \(b_{n_k} \to -\infty \) or \(b_{n_k} \to +\infty \). Let us prove the impossibility of the second one, the first is similar. So assume that \(b_{n_k} \to +\infty \). Since \(G\) is nondegenerate, we may pick a continuity point \(x_0\) of \(G\) such that \(0 {\lt} G(x_0) {\lt} 1\). Then we have \(F_n(x_0) \to G(x_0)\) by the assumption \(F_n \overset {\mathrm{d}}{\longrightarrow }G\). We may also pick a continuity point \(z\) of \(\widetilde{G}\) such that \(\widetilde{G}(z) {\gt} G(x_0)\). Then we have \((A_n.F_n)(z) \to \widetilde{G}(z)\) by the assumption \(A_n.F_n \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}\). But \(A_{n_k}(x_0) = a_{n_k} x_0 + b_{n_k} \to +\infty \), since \(0 {\lt} a_{n_k} \le M\) and \(b_{n_k} \to +\infty \). Therefore we have for all large enough \(k\) that \(A_{n_k}(x_0) {\gt} z\). And then

\begin{align*} (A_{n_k}.F_{n_k})(z) \; \le \; (A_{n_k}.F_{n_k})\big(A_{n_k}(x_0)\big) \; = \; F_{n_k}(x_0) . \end{align*}

The LHS tends to \(\widetilde{G}(z)\) as \(k \to \infty \), and the RHS tends to \(G(x_0)\). Therefore we get \(\widetilde{G}(z) \le G(x_0)\), contradicting the choice of \(z\). This shows that we cannot have \(b_{n_k} \to +\infty \). Similarly one proves that we cannot have \(b_{n_k} \to -\infty \). We conclude that \((b_n)_{n \in \mathbb {N}}\) is indeed bounded.

From now on, suppose furthermore that \(\widetilde{G}\) is also nondegenerate. We claim that then \((a_n)_{n \in \mathbb {N}}\) is bounded away from \(0\): for some \(\varepsilon {\gt} 0\) we have \(a_n \ge \varepsilon \) for all \(n \in \mathbb {N}\). If not, then we could extract a subsequence such that \(a_{n_k} \to 0\) and also \(b_{n_k} \to \beta \in \mathbb {R}\) (since \((b_n)_{n \in \mathbb {N}}\) is bounded). But since \(F_{n_k} \overset {\mathrm{d}}{\longrightarrow }G\), from Lemma 7.3 we would get \(A_{n_k} . F_{n_k} \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}_0\), where \(\widetilde{G}_0\) is degenerate. But by assumption \(A_{n_k} . F_{n_k} \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}\), where \(\widetilde{G}\) is nondegenerate; this is impossible by uniqueness of limits for convergence in distribution. Therefore \((a_n)_{n \in \mathbb {N}}\) must indeed be bounded away from \(0\).

Note that since \((a_n)_{n \in \mathbb {N}}\) is bounded away from \(0\) and \(+\infty \), and \((b_n)_{n \in \mathbb {N}}\) is bounded, we can extract a subsequence such that \(a_{n_k} \to \alpha \in (0,+\infty )\) and \(b_{n_k} \to \beta \in \mathbb {R}\). By assumption, we have \(A_{n_k} . F_{n_k} \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}\). But since \(A_{n_k} \to A\) where \(A(x) = \alpha x + \beta \) and we have also assumed \(F_{n_k} \overset {\mathrm{d}}{\longrightarrow }G\), this implies by continuity (Lemma 1.22) that \(A_{n_k} . F_{n_k} \overset {\mathrm{d}}{\longrightarrow }A . G\). By uniqueness of limits, we get \(A . G = \widetilde{G}\).

To prove that in fact \(A_n \to A\), not just along a subsequence, note the following. From any subsequence \((A_{n_k})_{k \in \mathbb {N}}\), we can extract a further subsequence along which \(a_{n_{k_\ell }}\) and \(b_{n_{k_\ell }}\) converge as above, with limits \(\alpha ' \in (0,+\infty )\) and \(\beta ' \in \mathbb {R}\). The same argument as above shows that \(A' . G = \widetilde{G}\) where \(A'(x) = \alpha ' x + \beta '\). Lemma 7.2 then says that we must have \(A' = A\), i.e., \(\alpha ' = \alpha \) and \(\beta ' = \beta \). Since any subsequence has a convergent further subsequence with the same limit, the entire sequence must converge, \(A_n \to A\). The proof is complete.

Theorem 7.6 Convergence to types again

Let \((A_n)_{n \in \mathbb {N}}\) and \((\widetilde{A}_n)_{n \in \mathbb {N}}\) be two sequences of oriented affine isomorphisms of \(\mathbb {R}\), \(A_n, \widetilde{A}_n \in \mathrm{Aff}^+_{\mathbb {R}}\). Write \(A_n(x) = a_n x + b_n\) and \(\widetilde{A}_n(x) = \tilde{a}_n x + \tilde{b}_n\), and for the inverses \(A_n^{-1}(x) = c_n x + d_n\) and \(\widetilde{A}^{-1}_n(x) = \tilde{c}_n x + \tilde{d}_n\).

Let \((F_n)_{n \in \mathbb {N}}\) be a sequence of c.d.f.s such that \(A_n.F_n \overset {\mathrm{d}}{\longrightarrow }G\), with \(G\) a nondegenerate c.d.f.

Then we also have \(\widetilde{A}_n.F_n \overset {\mathrm{d}}{\longrightarrow }G\) if and only if the coefficients of the affine maps satisfy the relations

\begin{align*} \frac{\tilde{a}_n}{a_n} \to 1 \quad \text{ and } \quad \frac{a_n \tilde{b}_n - \tilde{a}_n b_n}{a_n} \to 0, \end{align*}

or equivalently,

\begin{align*} \frac{\tilde{c}_n}{c_n} \to 1 \quad \text{ and } \quad \frac{\tilde{d}_n - d_n}{c_n} \to 0 . \end{align*}

Proof

(This is actually just a special case of what is stated as Corollary 7.7 below. The better organization is to prove that corollary directly, and obtain this theorem as a special case.)

We will apply the convergence to types with the reference sequence \((A_n.F_n)_{n \in \mathbb {N}}\), which by assumption tends to a nondegenerate \(G\).

To express the other sequence in terms of the reference sequence, we write

\begin{align*} \widetilde{A}_n.F_n \; = \; (\widetilde{A}_n A_n^{-1}).(A_n.F_n) . \end{align*}

By assumption this also tends to \(G\).

Theorem 7.5 applies, and guarantees convergence \(\widetilde{A}_n A_n^{-1} \to A\) to some \(A \in \mathrm{Aff}^+_{\mathbb {R}}\), and it also implies \(A.G = G\) (note that the other limit is also \(G\) by our assumptions). However, since \(G\) is nondegenerate and \(\operatorname{id}.G = G\), Lemma 7.2 shows that the unique \(A\) for which we have \(A.G = G\) is \(A = \operatorname{id}\). We therefore get \(\widetilde{A}_n A_n^{-1} \to \operatorname{id}\). To explicitly see the coefficients, note that

\begin{align*} A_n^{-1}(x) \; = \; & a_n^{-1} (x - b_n) \end{align*}

and

\begin{align*} \widetilde{A}_n \big( A_n^{-1}(x) \big) \; = \; & \tilde{a}_n \big(a_n^{-1} (x - b_n)\big) + \tilde{b}_n \\ \; = \; & \tilde{a}_n a_n^{-1} x - \tilde{a}_n a_n^{-1} b_n + \tilde{b}_n . \end{align*}

The convergence \(\widetilde{A}_n A_n^{-1} \to \operatorname{id}\) is equivalent to the convergence of the coefficients,

\begin{align*} \tilde{a}_n a_n^{-1} \; \longrightarrow \; & 1 \\ - \tilde{a}_n a_n^{-1} b_n + \tilde{b}_n \; \longrightarrow \; & 0 . \end{align*}

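The second one can be rewritten as

\begin{align*} \frac{a_n \tilde{b}_n - \tilde{a}_n b_n}{a_n} \; \longrightarrow \; & 0 . \end{align*}

This proves the "only if" direction. Conversely, if the two coefficient relations hold, then \(\widetilde{A}_n A_n^{-1} \to \operatorname{id}\), and by continuity (Lemma 1.22) we get \(\widetilde{A}_n.F_n = (\widetilde{A}_n A_n^{-1}).(A_n.F_n) \overset {\mathrm{d}}{\longrightarrow }\operatorname{id}.G = G\).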
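
As a hedged illustration (a standard example, not taken from the text above), let \(F(x) = 1 - e^{-x}\) for \(x \ge 0\) be the exponential c.d.f., take \(F_n = F^n\), and let \(A_n(x) = x - \log n\), so that \(a_n = 1\) and \(b_n = -\log n\). For fixed \(x \in \mathbb {R}\) and all \(n\) large enough that \(x + \log n \ge 0\),

\begin{align*} (A_n . F^n)(x) \; = \; F \big( x + \log n \big)^n \; = \; \Big( 1 - \frac{e^{-x}}{n} \Big)^n \; \longrightarrow \; \exp \big( - e^{-x} \big) , \end{align*}

which is a nondegenerate limit. For the alternative normalization \(\widetilde{A}_n(x) = x - \log (n+1)\), we have \(\tilde{a}_n = 1\) and \(\tilde{b}_n = -\log (n+1)\), and therefore

\begin{align*} \frac{\tilde{a}_n}{a_n} \; = \; 1 \qquad \text{ and } \qquad \frac{a_n \tilde{b}_n - \tilde{a}_n b_n}{a_n} \; = \; \log n - \log (n+1) \; = \; - \log \Big( 1 + \frac{1}{n} \Big) \; \longrightarrow \; 0 . \end{align*}

The criterion of Theorem 7.6 then says that \(\widetilde{A}_n . F^n\) converges to the same limit.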

Corollary 7.7 Convergence to types with different limits

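Let \((A_n)_{n \in \mathbb {N}}\) and \((\widetilde{A}_n)_{n \in \mathbb {N}}\) be sequences in \(\mathrm{Aff}^+_{\mathbb {R}}\), with coefficients denoted as in Theorem 7.6. Let \((F_n)_{n \in \mathbb {N}}\) be a sequence of c.d.f.s such that \(A_n.F_n \overset {\mathrm{d}}{\longrightarrow }G\) and \(\widetilde{A}_n.F_n \overset {\mathrm{d}}{\longrightarrow }\widetilde{G}\), with \(G\) and \(\widetilde{G}\) nondegenerate c.d.f.s. Then for some \(\alpha {\gt}0\) and \(\beta \in \mathbb {R}\) we have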

\begin{align*} \frac{\tilde{a}_n}{a_n} \to \alpha \quad \text{ and } \quad \frac{a_n \tilde{b}_n - \tilde{a}_n b_n}{a_n} \to \beta , \end{align*}

and we have

\begin{align*} A.G = \widetilde{G} \qquad \text{ where } \; A(x) = \alpha x + \beta . \end{align*}

Equivalently, with \(\gamma = \alpha ^{-1}\) and \(\delta = -\alpha ^{-1} \beta \) so that \(A^{-1}(x) = \gamma x + \delta \), we have

\begin{align*} \frac{\tilde{c}_n}{c_n} \to \gamma \quad \text{ and } \quad \frac{\tilde{d}_n - d_n}{c_n} \to \delta . \end{align*}

In particular, \(G\) and \(\widetilde{G}\) have the same type.

Proof

We will apply the convergence to types with the reference sequence \((A_n.F_n)_{n \in \mathbb {N}}\), which by assumption tends to a nondegenerate \(G\).

To express the other sequence in terms of the reference sequence, we write

\begin{align*} \widetilde{A}_n.F_n \; = \; (\widetilde{A}_n A_n^{-1}).(A_n.F_n) . \end{align*}

By assumption this tends to a nondegenerate \(\widetilde{G}\).

Theorem 7.5 applies, and guarantees convergence \(\widetilde{A}_n A_n^{-1} \to A\) to some \(A \in \mathrm{Aff}^+_{\mathbb {R}}\), and it also implies \(A.G = \widetilde{G}\).

Write \(A(x) = \alpha x + \beta \). To explicitly see the coefficients of \(\widetilde{A}_n A_n^{-1}\), note that

\begin{align*} A_n^{-1}(x) \; = \; & a_n^{-1} (x - b_n) \end{align*}

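and

\begin{align*} \widetilde{A}_n \big( A_n^{-1}(x) \big) \; = \; & \tilde{a}_n \big(a_n^{-1} (x - b_n)\big) + \tilde{b}_n \\ \; = \; & \tilde{a}_n a_n^{-1} x - \tilde{a}_n a_n^{-1} b_n + \tilde{b}_n . \end{align*}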

The convergence \(\widetilde{A}_n A_n^{-1} \to A\) is equivalent to the convergence of the coefficients,

\begin{align*} \tilde{a}_n a_n^{-1} \; \longrightarrow \; & \alpha \\ - \tilde{a}_n a_n^{-1} b_n + \tilde{b}_n \; \longrightarrow \; & \beta . \end{align*}

The second one can be rewritten as

\begin{align*} \frac{a_n \tilde{b}_n - \tilde{a}_n b_n}{a_n} \; \longrightarrow \; & \beta . \end{align*}
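
To make the coefficient relations concrete, here is a sketch with hypothetical normalizations chosen only for illustration: \(A_n(x) = x - \log n\) (so \(a_n = 1\), \(b_n = -\log n\)) and \(\widetilde{A}_n(x) = 2 x - 2 \log n + 3\) (so \(\tilde{a}_n = 2\), \(\tilde{b}_n = 3 - 2 \log n\)). Then

\begin{align*} \frac{\tilde{a}_n}{a_n} \; = \; & 2 \; = \; \alpha , \\ \frac{a_n \tilde{b}_n - \tilde{a}_n b_n}{a_n} \; = \; & (3 - 2 \log n) + 2 \log n \; = \; 3 \; = \; \beta , \end{align*}

so \(A(x) = 2 x + 3\) and \(\widetilde{G}(x) = (A.G)(x) = G \big( \frac{x-3}{2} \big)\). For the inverses, \(c_n = 1\), \(d_n = \log n\), \(\tilde{c}_n = \frac{1}{2}\) and \(\tilde{d}_n = \log n - \frac{3}{2}\), which gives

\begin{align*} \frac{\tilde{c}_n}{c_n} \; = \; \frac{1}{2} \; = \; \alpha ^{-1} \qquad \text{ and } \qquad \frac{\tilde{d}_n - d_n}{c_n} \; = \; - \frac{3}{2} \; = \; - \alpha ^{-1} \beta , \end{align*}

in agreement with \(\gamma = \alpha ^{-1}\) and \(\delta = -\alpha ^{-1} \beta \). In this example \(\widetilde{A}_n = A \circ A_n\) holds exactly for every \(n\).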

Lemma 7.8 A choice of normalizing constants for convergence to types

(It is possible to choose normalization constants for the affine transformations using the left-continuous inverses of the c.d.f.s. TODO: Precise statement.)

Proof

…

7.2 One-parameter subgroups of affine transformations

Definition 7.9 Subgroup of translations

The mapping \(s \mapsto A_s\) with

\begin{align*} A_s(x) = x + s \end{align*}

is a homomorphism \(\mathbb {R}\to \mathrm{Aff}^+_{\mathbb {R}}\). The image of this homomorphism is the subgroup of translations in \(\mathrm{Aff}^+_{\mathbb {R}}\).

Lemma 7.10 Only translations have no fixed points

If \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) has no fixed points (no \(x \in \mathbb {R}\) such that \(A(x) = x\)) then \(A\) belongs to the subgroup of translations, i.e., \(A(x) = x + s\) for some \(s \in \mathbb {R}\) (in fact \(s \ne 0\)).

Proof

Let us prove this by contraposition: any element \(A\) which is not a translation must have a fixed point. So assume that \(A\) is not a translation, i.e., \(A(x) = a x + b\) with some \(a \ne 1\). Then the fixed point equation \(A(x) = x\) reads

\begin{align*} a x + b = x , \end{align*}

and it has a solution \(x = \frac{-b}{a-1} \in \mathbb {R}\), which then is a fixed point of \(A\).
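
For example, with the illustrative choice \(A(x) = 2 x + 3\) (so \(a = 2\), \(b = 3\)), the formula gives

\begin{align*} x \; = \; \frac{-3}{2 - 1} \; = \; -3 , \qquad \text{ and indeed } \qquad A(-3) \; = \; 2 \cdot (-3) + 3 \; = \; -3 . \end{align*}

By contrast, for a translation \(A(x) = x + s\) the fixed point equation reduces to \(s = 0\), so there is no fixed point unless \(A\) is the identity.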

Lemma 7.11 Conjugate of translation is translation

Let \(A^{(\beta )}_s(x) = x + \beta s\) for \(s, \beta \in \mathbb {R}\), generalizing the translations of Definition 7.9 (which are the case \(\beta = 1\)). Let also \(B \in \mathrm{Aff}^+_{\mathbb {R}}\) be given by \(B(x) = a x + b\). Then

\begin{align*} B \, A^{(\beta )}_s \, B^{-1} = A^{(a \beta )}_{s} . \end{align*}

Proof

Calculate, for \(x \in \mathbb {R}\)

\begin{align*} (B A^{(\beta )}_s B^{-1})(x) \; = \; & (B A^{(\beta )}_s)\big( \frac{x-b}{a} \big) \\ \; = \; & B\big( \frac{x-b}{a} + \beta s \big) \\ \; = \; & a \big( \frac{x-b}{a} + \beta s \big) + b \\ \; = \; & x - b + a \beta s + b \\ \; = \; & x + a \beta s \; = \; A^{(a \beta )}_s(x). \end{align*}

Definition 7.12 Subgroup fixing a point

The mapping \(s \mapsto A_s\) with

\begin{align*} A_s(x) = e^{s} (x - c) + c \end{align*}

is a homomorphism \(\mathbb {R}\to \mathrm{Aff}^+_{\mathbb {R}}\). The image of this homomorphism is the subgroup fixing \(c\) in \(\mathrm{Aff}^+_{\mathbb {R}}\).
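
The homomorphism property claimed in this definition can be checked by a direct computation, sketched here for completeness. For \(s, t \in \mathbb {R}\) and \(x \in \mathbb {R}\),

\begin{align*} A_s \big( A_t(x) \big) \; = \; & e^{s} \big( A_t(x) - c \big) + c \\ \; = \; & e^{s} \, e^{t} (x - c) + c \\ \; = \; & e^{s+t} (x - c) + c \; = \; A_{s+t}(x) , \end{align*}

and \(A_0 = \operatorname{id}\). Each \(A_s\) has slope \(e^{s} {\gt} 0\), so it indeed lies in \(\mathrm{Aff}^+_{\mathbb {R}}\).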

Lemma 7.13 Characterization of the subgroup fixing a point

An orientation-preserving affine transformation \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) belongs to the subgroup fixing \(c \in \mathbb {R}\) if and only if \(A(c) = c\).

(Note that the subgroup is a priori defined as the image of a homomorphism, so the statement indeed requires a proof.)

Proof

Suppose first that \(A\) is an element of the said subgroup, i.e., \(A(x) = e^{s} (x - c) + c\) for some \(s \in \mathbb {R}\). Then clearly \(A(c) = c\).

Suppose then that \(A(c) = c\). Write \(A(x) = a x + b\) for \(a{\gt}0\) and \(b \in \mathbb {R}\). Plug in \(x=c\) in the assumed fixed point property to obtain

\begin{align*} a c + b = c . \end{align*}

The above can be solved to give \(b = (1-a)c\). Also since \(a{\gt}0\), we can write \(a = e^s\) with \(s \in \mathbb {R}\). With these, the formula for \(A\) simplifies to

\begin{align*} A(x) = e^s x + c (1 - e^s) = e^s \big( x - c \big) + c . \end{align*}

This shows \(A = A_s\) as desired (with \(A_s\) as in Definition 7.12).

Lemma 7.14 Conjugate of fixing is fixing image

Let \(A^{(\alpha ;c)}_s(x) = e^{\alpha s} (x - c) + c\) for \(s, \alpha , c \in \mathbb {R}\), generalizing the maps of Definition 7.12 (which are the case \(\alpha = 1\)). Let also \(B \in \mathrm{Aff}^+_{\mathbb {R}}\) be given by \(B(x) = a x + b\). Then

\begin{align*} B \, A^{(\alpha ;c)}_s \, B^{-1} = A^{(\alpha ;B(c))}_{s} . \end{align*}

Proof

Calculate, for \(x \in \mathbb {R}\)

\begin{align*} (B A^{(\alpha ;c)}_s B^{-1})(x) \; = \; & (B A^{(\alpha ;c)}_s)\big( \frac{x-b}{a} \big) \\ \; = \; & B\big( e^{\alpha s} (\frac{x-b}{a} - c) + c \big) \\ \; = \; & a e^{\alpha s} (\frac{x-b}{a} - c) + a c + b \\ \; = \; & e^{\alpha s} \big( x - (a c + b) \big) + (a c + b) \\ \; = \; & e^{\alpha s} \big( x - B(c) \big) + B(c) \\ \; = \; & A^{(\alpha ;B(c))}_s (x) . \end{align*}

7.3 Self-similarity characterizations of the extreme value distributions

Lemma 7.15 Continuous parameter extreme value limit relation

Let \(F\) be a c.d.f.

(Note that below we use the sequence \((F^n)_{n \in \mathbb {N}}\) of \(n\)th powers of a fixed c.d.f., not a sequence of arbitrary c.d.f.s. Recall that the \(n\)th power \(F^n\) is the c.d.f. of the maximum of \(n\) independent random variables with the distribution \(F\).)

Suppose that for a sequence \((A_n)_{n \in \mathbb {N}}\) of oriented affine isomorphisms of \(\mathbb {R}\), \(A_n \in \mathrm{Aff}^+_{\mathbb {R}}\), we have

\begin{align*} A_n.F^n \overset {\mathrm{d}}{\longrightarrow }G , \end{align*}

where \(G\) is a c.d.f.

Then, for any \(t {\gt} 0\), denoting by \(G^t\) the c.d.f. given by \(G^t(x) = \big( G(x) \big)^t\), we have

\begin{align*} A_{n}.F^{\lfloor n t \rfloor } \overset {\mathrm{d}}{\longrightarrow }G^t , \end{align*}

where, for \(x \in \mathbb {R}\), the floor notation \(\lfloor x \rfloor \) stands for the greatest integer \(k \in \mathbb {Z}\) such that \(k \le x\).

Proof

Let \(t {\gt} 0\) and let \(x \in \mathbb {R}\) be a continuity point of \(G\). For \(n \in \mathbb {N}\), calculate

\begin{align*} \big((A_n . F) (x) \big)^{\lfloor n t \rfloor } \; = \; & \Big( \big((A_n . F) (x) \big)^n \Big)^{\lfloor n t \rfloor / n} . \end{align*}

By assumption, we have \((A_n . F^n)(x) = \big((A_n . F) (x) \big)^n \to G(x)\) as \(n \to \infty \). Also \(\lfloor n t \rfloor / n \to t\) as \(n \to \infty \). By (joint) continuity of the power function \((u,v) \mapsto u^v\) on \([0,1] \times (0,+\infty )\) (given by \(\exp \big( v \, \log (u) \big)\) for \(u {\gt} 0\) and by \(0\) for \(u = 0\)), we get that the expression above tends to \(G(x)^t\).

Finally noting that the continuity points of \(G^t\) are the same as the continuity points of \(G\), the above in fact proves the asserted \(A_{n}.F^{\lfloor n t \rfloor } \overset {\mathrm{d}}{\longrightarrow }G^t\).
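
As a hedged illustration with a standard example not taken from the text above, let \(F(x) = 1 - e^{-x}\) for \(x \ge 0\) be the exponential c.d.f. and \(A_n(x) = x - \log n\). For fixed \(x \in \mathbb {R}\), \(t {\gt} 0\), and \(n\) large enough that \(x + \log n \ge 0\),

\begin{align*} (A_n . F^{\lfloor n t \rfloor })(x) \; = \; & \Big( 1 - \frac{e^{-x}}{n} \Big)^{\lfloor n t \rfloor } \; = \; \bigg( \Big( 1 - \frac{e^{-x}}{n} \Big)^{n} \bigg)^{\lfloor n t \rfloor / n} \\ \; \longrightarrow \; & \Big( \exp \big( - e^{-x} \big) \Big)^{t} \; = \; \exp \big( - t \, e^{-x} \big) . \end{align*}

Here the limit for \(t = 1\) is \(G(x) = \exp ( - e^{-x} )\), and the general limit is \(G(x)^t\), as the lemma asserts.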

Lemma 7.16 Self-similarity of extreme value distributions

Suppose that \(G\) is an extreme-value distribution. Then there exists a family \((A_t)_{t {\gt} 0}\) of oriented affine isomorphisms of \(\mathbb {R}\), \(A_t \in \mathrm{Aff}^+_{\mathbb {R}}\), such that for any \(t {\gt} 0\)

\begin{align*} G^t = A_t . G . \end{align*}

Moreover, \(t \mapsto A_t\) is a measurable group homomorphism from the multiplicative group \((0,+\infty )\) to \(\mathrm{Aff}^+_{\mathbb {R}}\), i.e., \(A_{t s} = A_t A_s\) for all \(s, t {\gt} 0\).

Proof

…
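
As an illustration of the statement (not a proof), assume the standard normalization \(\Lambda (x) = \exp ( - e^{-x} )\) of the Gumbel c.d.f., and grant that \(\Lambda \) is an extreme value distribution. Then for \(t {\gt} 0\) and \(x \in \mathbb {R}\),

\begin{align*} \Lambda (x)^t \; = \; \exp \big( - t \, e^{-x} \big) \; = \; \exp \big( - e^{-(x - \log t)} \big) \; = \; \Lambda \big( x - \log t \big) \; = \; (A_t . \Lambda )(x) \end{align*}

with \(A_t(x) = x + \log t\). In this case the homomorphism property \(A_{t s} = A_t A_s\) is just the identity \(\log (t s) = \log t + \log s\).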

Lemma 7.17 Self-similar continuous c.d.f. family characterization \(\gamma = 0\)

Suppose that \(G\) is a nondegenerate c.d.f. such that

\begin{align*} G^{t} = A_t . G \qquad \text{ for any } t {\gt} 0 , \end{align*}

where

\begin{align*} A_t(x) = x + \beta \, \log t \end{align*}

with \(\beta {\gt} 0\).

Then with \(d = \log \big(-\log G(0) \big)\), for all \(x \in \mathbb {R}\) we have

\begin{align*} G(x) = \exp \Big( - \exp \big( -\beta ^{-1} x + d \big) \Big) . \end{align*}

(In particular, \(G\) is of Gumbel type: there exists an \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) such that \(G = A.\Lambda \).)

Proof

Since \(G\) is nondegenerate, there exists an \(x_0 \in \mathbb {R}\) with \(0 {\lt} G(x_0) {\lt} 1\). Write \(q = -\log G(x_0) {\gt} 0\), so that \(G(x_0) = e^{-q}\).

For \(t {\gt} 0\), from the equation \(G^{t} = A_t . G\) we get for any \(x \in \mathbb {R}\)

\begin{align*} G(x)^t = G \big( A_t^{-1}(x) \big) = G \big( x - \beta \, \log (t) \big) . \end{align*}

In particular, with \(x = x_0 + \beta \, \log (t)\), we get

\begin{align*} G\big( x_0 + \beta \, \log (t) \big)^{t} = G(x_0) = e^{- q}, \end{align*}

from which we can solve

\begin{align*} G\big( x_0 + \beta \, \log (t) \big) = e^{-q/t} . \end{align*}

The above holds for any \(t{\gt}0\), and any \(x \in \mathbb {R}\) can be written as \(x = x_0 + \beta \, \log \big( e^{(x - x_0)/\beta } \big)\). We therefore get, for any \(x \in \mathbb {R}\),

\begin{align*} G(x) = \exp \Big( - q \, \exp \big( -(x-x_0)/\beta \big) \Big) . \end{align*}

This is of the desired form, with \(d = \frac{x_0}{\beta } + \log (q)\). Plugging in \(x = 0\) then shows \(d = \log \big(-\log G(0) \big)\).
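
As a consistency check, one can verify that the resulting formula does satisfy the assumed self-similarity. With \(G(x) = \exp \big( - \exp ( -\beta ^{-1} x + d ) \big)\) and \(t {\gt} 0\),

\begin{align*} G \big( x - \beta \, \log (t) \big) \; = \; & \exp \Big( - \exp \big( -\beta ^{-1} x + \log (t) + d \big) \Big) \\ \; = \; & \exp \Big( - t \, \exp \big( -\beta ^{-1} x + d \big) \Big) \; = \; G(x)^t . \end{align*}

Moreover, assuming the standard normalization \(\Lambda (x) = \exp ( - e^{-x} )\) of the Gumbel c.d.f., we have \(G(x) = \Lambda \big( \frac{x - \beta d}{\beta } \big)\), so one possible choice in \(G = A.\Lambda \) is \(A(x) = \beta x + \beta d\).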

Lemma 7.18 Self-similar continuous c.d.f. family characterization \(\gamma {\gt} 0\)

Suppose that \(G\) is a nondegenerate c.d.f. such that

\begin{align*} G^{t} = A_t . G \qquad \text{ for any } t {\gt} 0 , \end{align*}

where

\begin{align*} A_t(x) = t^{\alpha } x + c \, (1 - t^{\alpha }) = t^{\alpha } (x-c) + c \end{align*}

with \(c \in \mathbb {R}\) and \(\alpha {\gt} 0\).

Then with \(\sigma = \big(- \log G(c+1)\big)^{\alpha }\), for all \(x \ge c\) we have

\begin{align*} G(x) = \exp \Big( - \big( \frac{x - c}{\sigma } \big)^{-1/\alpha } \Big) . \end{align*}

(It easily follows that \(G\) is of Fréchet type: there exists an \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) such that \(G = A.\Phi _{\alpha }\).)

Proof

Since \(G\) is nondegenerate, there exists an \(x_0 \in \mathbb {R}\) with \(0 {\lt} G(x_0) {\lt} 1\). Write \(q = -\log G(x_0) {\gt} 0\), so that \(G(x_0) = e^{-q}\).

For \(t {\gt} 0\), from the equation \(G^{t} = A_t . G\) we get for any \(x \in \mathbb {R}\)

\begin{align*} G(x)^t = G \big( A_t^{-1}(x) \big) . \end{align*}

Note that for \(x \le c\) and \(t = 2\) we have \(A_2^{-1}(x) = 2^{-\alpha }(x-c) + c \ge x\), so the above implies \(G(x)^2 = G(A_2^{-1}(x)) \ge G(x)\). This is not possible if \(0 {\lt} G(x) {\lt} 1\), so we must in particular have \(x_0 {\gt} c\).

With \(x = A_t(x_0)\) in the above equation, we get

\begin{align*} G\big( A_t(x_0) \big)^{t} = G(x_0) = e^{- q}, \end{align*}

from which we can solve

\begin{align*} G(x) = G\big( A_t(x_0) \big) = e^{-q/t} . \end{align*}

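Any \(x {\gt} c\) can be written as \(x = A_t(x_0) = t^{\alpha } (x_0-c) + c\) with \(t = \big( \frac{x - c}{x_0 - c} \big)^{1/\alpha } {\gt} 0\) (recall that \(x_0 - c {\gt} 0\)). We therefore get, for any \(x {\gt} c\),

\begin{align*} G(x) = \exp \Big( - q \, \big( \frac{x - c}{x_0 - c} \big)^{-1/\alpha } \Big) . \end{align*}

Letting \(x \downarrow c\) shows \(G(x) \to 0\), so \(G(c) = 0\) by monotonicity, and the formula extends to \(x = c\) with the convention \(0^{-1/\alpha } = +\infty \).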

This is of the form

\begin{align*} G(x) = \exp \Big( - \big( \frac{x - c}{\sigma } \big)^{-1/\alpha } \Big) , \end{align*}

and plugging in \(x = c + 1\) yields \(\sigma = \big(- \log G(c+1)\big)^{\alpha }\).
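
As a consistency check, the formula can be verified against the assumed self-similarity. For \(x {\gt} c\) and \(t {\gt} 0\) we have \(A_t^{-1}(x) - c = t^{-\alpha } (x - c)\), and therefore

\begin{align*} G \big( A_t^{-1}(x) \big) \; = \; & \exp \Big( - \big( \frac{t^{-\alpha } (x - c)}{\sigma } \big)^{-1/\alpha } \Big) \\ \; = \; & \exp \Big( - t \, \big( \frac{x - c}{\sigma } \big)^{-1/\alpha } \Big) \; = \; G(x)^t . \end{align*}

Comparing the two expressions for \(G(x)\) in the proof also gives the scale parameter in terms of the auxiliary quantities, namely \(\sigma ^{1/\alpha } = q \, (x_0 - c)^{1/\alpha }\), i.e., \(\sigma = q^{\alpha } (x_0 - c)\).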

Lemma 7.19 Self-similar continuous c.d.f. family characterization \(\gamma {\lt} 0\)

Suppose that \(G\) is a nondegenerate c.d.f. such that

\begin{align*} G^{t} = A_t . G \qquad \text{ for any } t {\gt} 0 , \end{align*}

where

\begin{align*} A_t(x) = t^{-\alpha } x + c \, (1 - t^{-\alpha }) = t^{-\alpha } (x-c) + c \end{align*}

with \(c \in \mathbb {R}\) and \(\alpha {\gt} 0\).

Then with \(\sigma = \big(- \log G(c-1)\big)^{-\alpha }\), for all \(x \le c\) we have

\begin{align*} G(x) = \exp \Big( - \big( \frac{c - x}{\sigma } \big)^{1/\alpha } \Big) . \end{align*}

(It easily follows that \(G\) is of Weibull type: there exists an \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) such that \(G = A.\Psi _{\alpha }\).)

Proof

(Note: The Lean formalized statement uses the opposite sign of \(\alpha \): it is assumed that \(\alpha {\lt} 0\) and \(A_t(x) = t^{+\alpha } (x-c) + c\).)

Since \(G\) is nondegenerate, there exists an \(x_0 \in \mathbb {R}\) with \(0 {\lt} G(x_0) {\lt} 1\). Write \(q = -\log G(x_0) {\gt} 0\), so that \(G(x_0) = e^{-q}\).

For \(t {\gt} 0\), from the equation \(G^{t} = A_t . G\) we get for any \(x \in \mathbb {R}\)

\begin{align*} G(x)^t = G \big( A_t^{-1}(x) \big) . \end{align*}

Note that for \(x \ge c\) and \(t = 2\) we have \(A_2^{-1}(x) = 2^{\alpha }(x-c) + c \ge x\), so the above implies \(G(x)^2 = G(A_2^{-1}(x)) \ge G(x)\). This is not possible if \(0 {\lt} G(x) {\lt} 1\), so we must in particular have \(x_0 {\lt} c\).

With \(x = A_t(x_0)\) in the above equation, we get

\begin{align*} G\big( A_t(x_0) \big)^{t} = G(x_0) = e^{- q}, \end{align*}

from which we can solve

\begin{align*} G(x) = G\big( A_t(x_0) \big) = e^{-q/t} . \end{align*}

Any \(x {\lt} c\) can be written as \(x = A_t(x_0) = t^{-\alpha } (x_0-c) + c\) with \(t = \big( \frac{c - x}{c - x_0} \big)^{-1/\alpha } {\gt} 0\) (recall that \(c - x_0 {\gt} 0\)). We therefore get, for any \(x {\lt} c\),

\begin{align*} G(x) = \exp \Big( - q \, \big( \frac{c - x}{c - x_0} \big)^{1/\alpha } \Big) . \end{align*}

Letting \(x \uparrow c\) shows \(G(x) \to 1\), so \(G(c) = 1\) by monotonicity, and the formula also holds at \(x = c\).

This is of the form

\begin{align*} G(x) = \exp \Big( - \big( \frac{c - x}{\sigma } \big)^{1/\alpha } \Big) , \end{align*}

and plugging in \(x = c - 1\) yields \(\sigma = \big(- \log G(c-1)\big)^{-\alpha }\).
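
As a consistency check, the formula can be verified against the assumed self-similarity. For \(x {\lt} c\) and \(t {\gt} 0\) we have \(c - A_t^{-1}(x) = t^{\alpha } (c - x)\), and therefore

\begin{align*} G \big( A_t^{-1}(x) \big) \; = \; & \exp \Big( - \big( \frac{t^{\alpha } (c - x)}{\sigma } \big)^{1/\alpha } \Big) \\ \; = \; & \exp \Big( - t \, \big( \frac{c - x}{\sigma } \big)^{1/\alpha } \Big) \; = \; G(x)^t . \end{align*}

Comparing the two expressions for \(G(x)\) in the proof also gives \(\sigma ^{-1/\alpha } = q \, (c - x_0)^{-1/\alpha }\), i.e., \(\sigma = q^{-\alpha } (c - x_0)\).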

Theorem 7.20 Three types of extreme value distributions [Fisher-Tippett-Gnedenko]


For any extreme value distribution \(G\), one of the following holds:

: (\(\Lambda \)) \(G = A . \Lambda \) for some \(A \in \mathrm{Aff}^+_{\mathbb {R}}\);
: (\(\Phi _{\alpha }\)) \(G = A . \Phi _{\alpha }\) for some \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) and \(\alpha {\gt} 0\);
: (\(\Psi _{\alpha }\)) \(G = A . \Psi _{\alpha }\) for some \(A \in \mathrm{Aff}^+_{\mathbb {R}}\) and \(\alpha {\gt} 0\).

In particular, the only three possible types of extreme value distributions are the type of the Gumbel c.d.f., the type of the Fréchet c.d.f. \(\Phi _{\alpha }\) for \(\alpha {\gt} 0\), and the type of the Weibull c.d.f. \(\Psi _{\alpha }\) for \(\alpha {\gt} 0\).

Proof

…