## Condition number of a matrix: basics

For \(n\times n\) square matrices \(\mathrm{A}\), a special kind of norm is the *subordinate* (or *induced*) norm.

If we have a norm \(\lVert\cdot\rVert_\alpha\) on the input space, and a second norm \(\lVert\cdot\rVert_\beta\) on the output space, the subordinate norm is

$$

\lVert\mathrm{A}\rVert_{\alpha, \beta} = \sup_{x \neq 0}\dfrac{\;\lVert\mathrm{A}x\rVert_{\beta}}{\;\lVert x\rVert_\alpha}

$$
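As a quick numerical sketch (using NumPy, with both \(\alpha\) and \(\beta\) taken to be the Euclidean norm), the supremum can be approximated by sampling random directions \(x\) and compared against the exact induced 2-norm, which equals the largest singular value of \(\mathrm{A}\):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# Estimate sup_{x != 0} ||A x||_2 / ||x||_2 by sampling random directions.
xs = rng.standard_normal((4, 100_000))
ratios = np.linalg.norm(A @ xs, axis=0) / np.linalg.norm(xs, axis=0)
estimate = ratios.max()

# The exact induced 2-norm is the largest singular value of A,
# which NumPy returns for ord=2.
exact = np.linalg.norm(A, ord=2)
print(estimate, exact)  # the estimate approaches `exact` from below
```

Every sampled ratio is a lower bound on the supremum, so the estimate approaches the true norm from below as more directions are drawn.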

## Submultiplicative property

An important property of these induced norms is that for any third norm \(\lVert\cdot\rVert_{\gamma}\), we have:

$$

\lVert\mathrm{AB}\rVert_{\alpha, \beta} \leq \lVert\mathrm{A}\rVert_{\gamma, \beta} \lVert\mathrm{B}\rVert_{\alpha, \gamma}

$$

This is easily proven via the computations

$$

\dfrac{\;\lVert\mathrm{AB}x\rVert_{\beta}}{\;\lVert x\rVert_\alpha}

=

\dfrac{\;\lVert\mathrm{A}y\rVert_{\beta}}{\;\lVert y\rVert_\gamma}

\dfrac{\;\lVert\mathrm{B}x\rVert_{\gamma}}{\;\lVert x\rVert_\alpha}

$$

for \(y = \mathrm{B}x\) (assuming \(\mathrm{B}x \neq 0\); if \(\mathrm{B}x = 0\), the left-hand side vanishes and the inequality is trivial).

As each supremum on the right is an upper bound, we get the inequality

$$

\dfrac{\;\lVert\mathrm{AB}x\rVert_{\beta}}{\;\lVert x\rVert_\alpha}

\leq

\lVert\mathrm{A}\rVert_{\gamma, \beta}

\lVert\mathrm{B}\rVert_{\alpha, \gamma}

$$

Taking the supremum over \(x \neq 0\) on the left-hand side then gives the stated bound on \(\lVert\mathrm{AB}\rVert_{\alpha, \beta}\), because the supremum is the least upper bound.

A special case is obtained by taking \(\alpha = \beta = \gamma\), which yields

$$\lVert\mathrm{AB}\rVert_{\alpha, \alpha} \leq \lVert\mathrm{A}\rVert_{\alpha, \alpha} \lVert\mathrm{B}\rVert_{\alpha, \alpha}

$$

We denote this norm simply by \(\lVert\cdot\rVert_\alpha\).
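A small NumPy sketch checking this single-norm case of submultiplicativity for three standard induced norms:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((5, 5))

# Check ||AB|| <= ||A|| ||B|| for three standard induced norms:
# ord=1 (max column sum), ord=2 (largest singular value),
# ord=inf (max row sum).
for p in (1, 2, np.inf):
    lhs = np.linalg.norm(A @ B, ord=p)
    rhs = np.linalg.norm(A, ord=p) * np.linalg.norm(B, ord=p)
    assert lhs <= rhs * (1 + 1e-12)
```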

## Condition number

The condition number \(\kappa_\alpha(\mathrm{A})\) of an invertible matrix \(\mathrm{A}\) is the quantity

$$\kappa_\alpha(\mathrm{A}) = \lVert\mathrm{A}\rVert_\alpha\lVert\mathrm{A}^{-1}\rVert_\alpha $$

which is at least \(1\) because

$$

1 = \lVert\mathrm{I}\rVert_\alpha = \lVert\mathrm{A}\mathrm{A}^{-1}\rVert_\alpha \leq \lVert\mathrm{A}\rVert_\alpha\lVert\mathrm{A}^{-1}\rVert_\alpha

$$

This number is large when either \(\mathrm{A}\) or its inverse creates large distortions as the input varies over the unit sphere.
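The definition can be checked numerically: the product of the two norms agrees with NumPy's built-in condition number (for the 2-norm, the ratio of extreme singular values), and is always at least 1.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
Ainv = np.linalg.inv(A)

# kappa_2(A) = ||A||_2 * ||A^{-1}||_2.
kappa = np.linalg.norm(A, 2) * np.linalg.norm(Ainv, 2)

# NumPy computes the same quantity directly, and it is always >= 1.
assert kappa >= 1.0
assert np.isclose(kappa, np.linalg.cond(A, 2))
```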

## Stability of a similarity

In linear algebra, a matrix \(\mathrm{B}\) is said to be similar to \(\mathrm{A}\) when there is an invertible transformation \(\mathrm{X}\) such that \(\mathrm{B} = \mathrm{X}\mathrm{A}\mathrm{X}^{-1}\).

In numerical analysis, we often wish to know how robust an operation \(f\colon x \mapsto f(x)\) is with respect to a small perturbation of its input: if \(y = f(x)\), what is \(f(x+\varepsilon)\)?

For real functions, continuity tells us that for any tolerance \(\tilde{\varepsilon} > 0\) there is a \(\delta > 0\) such that, as long as \(\lvert \varepsilon \rvert < \delta\), the perturbed output \(f(x+\varepsilon)\) is \(\tilde{\varepsilon}\)-close to \(y\).

What is the equivalent bound on the error of the matrix function \(f_{\mathrm{X}}(\mathrm{A}) = \mathrm{X}\mathrm{A}\mathrm{X}^{-1}\) ?

Suppose \(\mathrm{A}\) is perturbed by a matrix \(\mathrm{E}\) of small magnitude, \(\lVert\mathrm{E}\rVert_\alpha < \delta\).

Then, the value of the function at this perturbed input is

$$

f_{\mathrm{X}}(\mathrm{A}+ \mathrm{E}) = \mathrm{X}(\mathrm{A}+ \mathrm{E})\mathrm{X}^{-1} = f_{\mathrm{X}}(\mathrm{A}) + f_{\mathrm{X}}(\mathrm{E})

$$

Thus, since \(\mathrm{B} = f_{\mathrm{X}}(\mathrm{A})\), the norm of the error \(\mathrm{B} - f_{\mathrm{X}}(\mathrm{A}+ \mathrm{E})\) is given by

$$

\lVert\mathrm{B} - f_{\mathrm{X}}(\mathrm{A}+ \mathrm{E})\rVert_\alpha = \lVert\mathrm{X}\mathrm{E}\mathrm{X}^{-1}\rVert_\alpha \leq \lVert\mathrm{X}\rVert_\alpha\,\lVert\mathrm{E}\rVert_\alpha\,\lVert\mathrm{X}^{-1}\rVert_\alpha = \kappa_\alpha(\mathrm{X})\,\lVert\mathrm{E}\rVert_\alpha

$$

Therefore, the condition number \(\kappa_\alpha\) serves as a measure of continuity for the similarity operation \(f_\mathrm{X}(\cdot)\). If the condition number is small, the matrix is said to be “well-behaved”: for small input errors \(\mathrm{E}\), the output is perturbed by at most the small amount \(\kappa_\alpha(\mathrm{X})\,\lVert\mathrm{E}\rVert_\alpha\).

If the condition number is large, even small input errors cause large deviations in the output.
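This bound is easy to observe numerically. A minimal NumPy sketch, with a random (almost surely invertible) transform \(\mathrm{X}\) and a small perturbation \(\mathrm{E}\):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 4))         # similarity transform (assumed invertible)
E = 1e-6 * rng.standard_normal((4, 4))  # small input perturbation

# Output perturbation of the similarity f_X(A) = X A X^{-1}.
output_err = np.linalg.norm(X @ E @ np.linalg.inv(X), 2)
bound = np.linalg.cond(X, 2) * np.linalg.norm(E, 2)
assert output_err <= bound * (1 + 1e-10)
```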

## Orthogonal matrices are stable

The norm induced by setting \(\alpha = 2\) reveals that orthogonal matrices (the matrices \(\mathrm{Q}\) for which \(\mathrm{Q}^\mathsf{T}\mathrm{Q} = \mathrm{I}\)) are computationally very stable.

Indeed, \(\kappa_2(\mathrm{Q}) = 1\): the perturbation created by the similarity \(\mathrm{A}\mapsto \mathrm{Q}\mathrm{A}\mathrm{Q}^\mathsf{T}\) satisfies \(\lVert\mathrm{Q}\mathrm{E}\mathrm{Q}^\mathsf{T}\rVert_2 = \lVert\mathrm{E}\rVert_2\), because multiplication by an orthogonal matrix preserves the Euclidean norm of every vector.

These transformations introduce **no extra perturbation**!
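A quick NumPy check, building a random orthogonal matrix from a QR factorisation:

```python
import numpy as np

rng = np.random.default_rng(4)
# The orthogonal factor of a QR decomposition is a random orthogonal matrix.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
E = rng.standard_normal((4, 4))

# An orthogonal similarity leaves the 2-norm of the perturbation
# unchanged, and kappa_2(Q) = 1.
assert np.isclose(np.linalg.norm(Q @ E @ Q.T, 2), np.linalg.norm(E, 2))
assert np.isclose(np.linalg.cond(Q, 2), 1.0)
```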

## Stability of a time-varying linear system

Consider the time-varying \(n\times n\) system \(\mathrm{A}(t)\, x(t) = b(t)\).

Here, the notion of “stable” is different: a time-varying linear system is unstable if short time intervals cause large variations in the state \(x\).

A stable system therefore has a small relative variation \(\lVert \dot{x} \rVert / \lVert x \rVert\).

Differentiating the equality \(\mathrm{A}(t)\,x(t) = b(t)\) with respect to \(t\) gives \(\dot{\mathrm{A}}(t)\,x(t) + \mathrm{A}(t)\,\dot{x}(t) = \dot{b}(t)\), hence

$$

\dot{x}(t) = \mathrm{A}(t)^{-1}\dot{b}(t) - \mathrm{A}(t)^{-1}\dot{\mathrm{A}}(t)\,x(t)

$$

Therefore the relative variation of the state can be upper-bounded (dropping the argument \(t\) for simplicity):

$$

\dfrac{\lVert\dot{x}\rVert}{\lVert x \rVert} \leq \kappa(\mathrm{A}) \left(\dfrac{\lVert \dot{b} \rVert}{\lVert \mathrm{A}\rVert \lVert x\rVert } + \dfrac{\lVert \dot{\mathrm{A}}\rVert}{\lVert \mathrm{A}\rVert}\right)

$$

Therefore, a system whose matrix has a small condition number at time \(t\) will stay relatively close to its current state in the near future.
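To see this bound in action, here is a minimal NumPy sketch; the particular forms chosen for \(\mathrm{A}(t)\) and \(b(t)\) are illustrative assumptions, evaluated at a single time:

```python
import numpy as np

# A toy time-varying system A(t) x(t) = b(t); the forms of A(t) and
# b(t) below are hypothetical, chosen only for illustration.
def A(t):    return np.array([[2.0 + t, 0.3], [0.1, 1.0 + 0.5 * t]])
def Adot(t): return np.array([[1.0, 0.0], [0.0, 0.5]])
def b(t):    return np.array([np.sin(t), np.cos(t)])
def bdot(t): return np.array([np.cos(t), -np.sin(t)])

t = 0.7
Ainv = np.linalg.inv(A(t))
x = Ainv @ b(t)                             # current state
xdot = Ainv @ bdot(t) - Ainv @ Adot(t) @ x  # derivative formula from above

lhs = np.linalg.norm(xdot) / np.linalg.norm(x)
nA = np.linalg.norm(A(t), 2)
rhs = np.linalg.cond(A(t), 2) * (
    np.linalg.norm(bdot(t)) / (nA * np.linalg.norm(x))
    + np.linalg.norm(Adot(t), 2) / nA
)
assert lhs <= rhs  # the relative variation respects the bound
```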