Banach’s Fixed Point Theorem

The capstone project of my Real Analysis class was to pick a topic and learn it independently. I chose contraction mapping because of its importance in dynamic programming and optimization problems. I am grateful to many keen bloggers who wrote out more thorough proofs than many textbooks. Below is my contribution to the internet: a full write up of Banach’s Fixed Point Theorem, with no proofs left to the reader.

To start, let us review the definitions of contraction mapping and fixed points:

Let (S,d) be a metric space and f: X \to X. f is a contraction mapping if there exists a real number k \in [0,1), such that

    \[d(f(x),f(y))\leq kd(x,y)\]

\forall x, y \in X, where the term k is called a Lipschitz coefficient for f.

Let (X,d) be a complete metric space and let f: X \to X be a contraction. There there is one unique fixed point x^* such that

    \[f(x^*) = x^*\]

The trick to this proof is to claim an inductive sequence f^{(n)}(x) by composing f with f some n times.
We show that the set \{f^{(n)}(x)\} is convergent to a limit by showing it is Cauchy in a complete metric space. That limit point will be our unique fixed point.

Because we have a complete metric space, we can use the following triangle inequality \forall x, y \in X,

    \begin{align*} d(x, y) \leq d(x, f(x)) + d(f(x), f(y)) + d(f(y), y) \\ \leq d(f(x), x) + kd(x, y) + d(f(y), y) \end{align*}

Because f is a contraction, we use the definition to obtain the second inequality.

    \begin{align*} d(x, y) \leq \frac{1}{1 - k}(d(f(x), x) + d(f(y), y)) \end{align*}

Consider two points in the sequence (f^{(n)}(x)) such as f^{(m)}(x), f^{(l)}(x). We can bound their distance using the simplified inequality from above by substituting x, y such that

    \begin{align*} d(f^{(m)}(x), f^{(l)}(x)) \leq \frac{1}{1-k}[d(f^{(m+1)}(x), f^{(m)}(x)) + d(f^{(l+1)}(x), f^{(l)}(x))] \\ \leq \frac{1}{1-k}[k^md(f(x), x) + k^ld(f(x), x)] \end{align*}

We can prove the second inequality through a proof by induction. See the Appendix. For now we conclude

    \begin{align*} \frac{1}{1-k}[k^md(f(x), x) + k^ld(f(x), x)] = \frac{k^m + k^l}{1 - k}d(f(x), x) \\ d(f^{(m)}(x), f^{(l)}(x) \leq \frac{k^m + k^l}{1 - k}d(f(x), x) < \epsilon \end{align*}

where \epsilon > 0. We know d(f(x), x) > 0 by the metric space definition and the fraction \frac{k^m + k^l}{1 - k} > 0 because k is bounded by [0, 1). Therefore, our arbitrary points, d(f^{(m)}(x), f^{(l)}(x) is a Cauchy sequence for some m, l > N. Cauchy sequences are convergent in a complete space so there exists

    \[\lim_{m\to \infty} f^{(m)}(x) = x^*\]

The next step is to show that x^* is a fixed point. Recall the definition of fixed point f(x^*) = x^*. Since we can now define x^*, let us substitute the following

    \begin{align*} f(x^*) = f(\lim_{m\to \infty} f^{(m)}(x)) \\ = \lim_{m\to \infty} f(f^{(m)}(x)) \\ = \lim_{m\to \infty} f^{(m)}(x) \\ = x^* \end{align*}

These steps work because we f(f^{(m)}) = f^{(m + 1)} so the limit of one is equal to the limit of the other. This shows x^* is a fixed point. Lastly, we must show this point is unique. Suppose we have another fixed point y^* such that x^* \neq y^*, then we have

    \begin{align*} 0 < d(x^*, y^*) = d(f(x^*), f(y^*)) \leq kd(x^*, y^*) < d(x^*, y^*) \end{align*}

This is a contradiction because d(x^*, y^*) cannot be both greater and lower than itself. Therefore we claim f(x^*) = x^* is a unique fixed point.

Appendix

This section is focusing on the claim that

    \begin{align*} \frac{1}{1-k}[d(f^{(m+1)}(x), f^{(m)}(x)) + d(f^{(l+1)}(x), f^{(l)}(x))] \leq \\ \frac{1}{1-k}[k^md(f(x), x) + k^ld(f(x), x)] \end{align*}

The condition we show is

    \[d(f^{(m+1)}(x), f^{(m)}(x)) \leq k^md(f(x), x)\]

We hope to do this intuitively by showing a pattern and then claiming by induction that this is true for any m, l > N

    \begin{align*} d(f^{(m+1)}(x), f^{(m)}(x)) \leq k^1d(f^m(x), f^{(m-1)}(x)) \end{align*}

This inequality is from the definition of contraction mapping. Let us extend this inequality

    \begin{align*} k^1d(f^m(x), f^{(m-1)}(x)) \leq k^2d(f^{(m-1)}(x), f^{(m-2)}(x)) \\ \leq k^3d(f^{(m-2)}(x), f^{(m-3)}(x)) \\ \vdots \\ \leq k^md(f^{(m-(m-1))}(x), f^{(m-m)}(x)) \end{align*}

Notice the pattern of the k power. One less of the k power is the nth iteration of the first function and the k power matches the nth interation of the second function. By simplifying the last inequality, we have

    \[d(f^{(m+1)}(x), f^{(m)}(x)) \leq k^md(f(x),x)\]