# Shae's Ramblings

Stuff I find slightly meaningful and close to achievable

## Condition number of a matrix: basics

A special kind of norm for $n\times n$ square matrices $\mathrm{A}$ is called the subordinate (or induced) norm.
If we have a norm $\lVert\cdot\rVert_\alpha$ on the input space, and a second norm $\lVert\cdot\rVert_\beta$ on the output space, the subordinate norm is

$$\lVert\mathrm{A}\rVert_{\alpha, \beta} = \sup_{x \neq 0}\dfrac{\;\lVert\mathrm{A}x\rVert_{\beta}}{\;\lVert x\rVert_\alpha}$$
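The post itself has no code, but a quick numerical sketch may help. For $\alpha = \beta \in \{1, 2, \infty\}$, NumPy computes the induced norm directly; we can also sanity-check the definition by sampling ratios $\lVert\mathrm{A}x\rVert / \lVert x\rVert$ (the matrix `A` below is just an illustrative choice):

```python
import numpy as np

# For alpha = beta = 1, 2, or inf, np.linalg.norm gives the induced norm.
A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

norm_1 = np.linalg.norm(A, ord=1)         # max absolute column sum
norm_inf = np.linalg.norm(A, ord=np.inf)  # max absolute row sum
norm_2 = np.linalg.norm(A, ord=2)         # largest singular value

# Sanity check against the definition: sample many nonzero vectors x
# and verify ||Ax|| / ||x|| never exceeds the induced 2-norm.
rng = np.random.default_rng(0)
xs = rng.standard_normal((1000, 2))
ratios = np.linalg.norm(xs @ A.T, axis=1) / np.linalg.norm(xs, axis=1)
assert ratios.max() <= norm_2 + 1e-12
```

The supremum in the definition is attained on the unit sphere, so sampling only ever approaches the induced norm from below.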

## Submultiplicative property

An important property of these induced norms is that for any third norm $\lVert\cdot\rVert_{\gamma}$, we have:

$$\lVert\mathrm{AB}\rVert_{\alpha, \beta} \leq \lVert\mathrm{A}\rVert_{\gamma, \beta} \lVert\mathrm{B}\rVert_{\alpha, \gamma}$$

This is easily proven via the computation

$$\dfrac{\;\lVert\mathrm{AB}x\rVert_{\beta}}{\;\lVert x\rVert_\alpha} = \dfrac{\;\lVert\mathrm{A}y\rVert_{\beta}}{\;\lVert y\rVert_\gamma} \dfrac{\;\lVert\mathrm{B}x\rVert_{\gamma}}{\;\lVert x\rVert_\alpha}$$
for $y = \mathrm{B}x$ (when $\mathrm{B}x = 0$ the left-hand side vanishes and the bound is trivial).
As each factor on the right is bounded by its supremum, we get the inequality

$$\dfrac{\;\lVert\mathrm{AB}x\rVert_{\beta}}{\;\lVert x\rVert_\alpha} \leq \lVert\mathrm{A}\rVert_{\gamma, \beta} \lVert\mathrm{B}\rVert_{\alpha, \gamma}$$

Taking the supremum over $x \neq 0$ on the left then yields the claimed inequality, because the supremum is the least upper bound.
A special case is obtained by taking $\alpha = \beta = \gamma$, which yields
$$\lVert\mathrm{AB}\rVert_{\alpha, \alpha} \leq \lVert\mathrm{A}\rVert_{\alpha, \alpha} \lVert\mathrm{B}\rVert_{\alpha, \alpha}$$
we denote this norm $\lVert\cdot\rVert_\alpha$.
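The submultiplicative property is easy to check numerically; here is a small sketch (random matrices, induced 2-norm) confirming $\lVert\mathrm{AB}\rVert \leq \lVert\mathrm{A}\rVert\,\lVert\mathrm{B}\rVert$:

```python
import numpy as np

# Numerical check of submultiplicativity, ||AB|| <= ||A|| ||B||,
# for the induced 2-norm on a batch of random matrices.
rng = np.random.default_rng(1)
for _ in range(100):
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))
    lhs = np.linalg.norm(A @ B, ord=2)
    rhs = np.linalg.norm(A, ord=2) * np.linalg.norm(B, ord=2)
    assert lhs <= rhs + 1e-10  # small tolerance for floating point
```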

## Condition number

The condition number $\kappa_\alpha(\mathrm{A})$ of a matrix is the quantity
$$\kappa_\alpha(\mathrm{A}) = \lVert\mathrm{A}\rVert_\alpha\lVert\mathrm{A}^{-1}\rVert_\alpha$$
which is at least $1$ because
$$1 = \lVert\mathrm{I}\rVert_\alpha = \lVert\mathrm{A}\mathrm{A}^{-1}\rVert_\alpha \leq \lVert\mathrm{A}\rVert_\alpha\lVert\mathrm{A}^{-1}\rVert_\alpha$$
This number is large when either $\mathrm{A}$ or its inverse creates large distortions as the input varies on the unit sphere.
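In NumPy, `np.linalg.cond` computes this quantity directly; the nearly singular matrix below is an illustrative choice showing how close-to-singular matrices have huge condition numbers:

```python
import numpy as np

# kappa_2(A) = ||A||_2 ||A^{-1}||_2; np.linalg.cond computes it directly.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])  # nearly singular, hence badly conditioned

kappa = np.linalg.cond(A, p=2)
manual = np.linalg.norm(A, ord=2) * np.linalg.norm(np.linalg.inv(A), ord=2)

assert abs(kappa - manual) / kappa < 1e-8  # two ways, same number
assert kappa >= 1.0                        # the lower bound proved above
```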

## Stability of a similarity

In linear algebra, a matrix $\mathrm{B}$ is said to be similar to $\mathrm{A}$ when there is an invertible transformation $\mathrm{X}$ such that $\mathrm{B} = \mathrm{X}\mathrm{A}\mathrm{X}^{-1}$.
In numerical analysis, we often wish to know how robust an operation $f\colon x \mapsto f(x)$ is with respect to a small perturbation of its input: if $y = f(x)$, what is $f(x+\varepsilon)$?
For real functions, continuity tells us that for any tolerance $\tilde{\varepsilon} > 0$ there is a $\delta > 0$ such that, as long as $\lvert \varepsilon \rvert < \delta$, the perturbed output $f(x+\varepsilon)$ is $\tilde{\varepsilon}$-close to $y$.
What is the equivalent bound on the error of the matrix function $f_{\mathrm{X}}(\mathrm{A}) = \mathrm{X}\mathrm{A}\mathrm{X}^{-1}$?
Suppose $\mathrm{A}$ is perturbed by a matrix $\mathrm{E}$ with small magnitude: $\lVert\mathrm{E}\rVert_\alpha < \delta$.
Then, the value of the function at this perturbed input is

$$f_{\mathrm{X}}(\mathrm{A}+ \mathrm{E}) = \mathrm{X}(\mathrm{A}+ \mathrm{E})\mathrm{X}^{-1} = f_{\mathrm{X}}(\mathrm{A}) + f_{\mathrm{X}}(\mathrm{E})$$

thus, since $\mathrm{B} = f_{\mathrm{X}}(\mathrm{A})$, the norm of the error $f_{\mathrm{X}}(\mathrm{A}+ \mathrm{E}) - \mathrm{B}$ is given by

$$\lVert f_{\mathrm{X}}(\mathrm{A}+ \mathrm{E}) - \mathrm{B}\rVert_\alpha = \lVert\mathrm{X}\mathrm{E}\mathrm{X}^{-1}\rVert_\alpha \leq \kappa_\alpha(\mathrm{X})\,\lVert\mathrm{E}\rVert_\alpha$$

Therefore, the condition number $\kappa_\alpha$ serves as a measure of continuity for the similarity operation $f_\mathrm{X}(\cdot)$. If the condition number is small, the matrix $\mathrm{X}$ is said to be “well-behaved”: for a small input error $\mathrm{E}$, the output is perturbed by at most the small amount $\kappa_\alpha(\mathrm{X})\,\lVert\mathrm{E}\rVert_\alpha$.
If the condition number is large, even small input errors cause large deviations in the output.
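A small sketch of this bound, with a random $\mathrm{X}$ and a tiny perturbation $\mathrm{E}$ (both arbitrary illustrative choices):

```python
import numpy as np

# Verify ||X E X^{-1}||_2 <= kappa_2(X) ||E||_2 on random data.
rng = np.random.default_rng(2)
X = rng.standard_normal((5, 5))
E = 1e-6 * rng.standard_normal((5, 5))  # small input perturbation

err = np.linalg.norm(X @ E @ np.linalg.inv(X), ord=2)
bound = np.linalg.cond(X, p=2) * np.linalg.norm(E, ord=2)
assert err <= bound * (1 + 1e-10)  # the output error respects the bound
```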

## Orthogonal matrices are stable

The norm induced by setting $\alpha = 2$ reveals that orthogonal matrices (the matrices $\mathrm{Q}$ for which $\mathrm{Q}^\mathsf{T}\mathrm{Q} = \mathrm{I}$) are computationally very stable.
Indeed, since $\kappa_2(\mathrm{Q}) = 1$, the perturbation created by the similarity $\mathrm{A}\mapsto \mathrm{Q}\mathrm{A}\mathrm{Q}^\mathsf{T}$ satisfies $\lVert\mathrm{Q}\mathrm{E}\mathrm{Q}^\mathsf{T}\rVert_2 = \lVert\mathrm{E}\rVert_2$.

These transformations introduce no extra perturbation!
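We can check this invariance numerically; a random orthogonal matrix is obtained from the QR factorization of a random matrix (an illustrative construction):

```python
import numpy as np

# The 2-norm is invariant under orthogonal similarity:
# ||Q E Q^T||_2 == ||E||_2 for any orthogonal Q.
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # random orthogonal Q
E = rng.standard_normal((5, 5))

lhs = np.linalg.norm(Q @ E @ Q.T, ord=2)
rhs = np.linalg.norm(E, ord=2)
assert abs(lhs - rhs) < 1e-10 * rhs            # norms agree
assert abs(np.linalg.cond(Q, p=2) - 1.0) < 1e-10  # kappa_2(Q) = 1
```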

## Stability of a time-varying linear system

Consider the time-varying $n\times n$ system $\mathrm{A}(t)\, x(t) = b(t)$.
Here, the notion of “stable” is different: a time-varying linear system is unstable if small changes in time cause large variations in the state $x$.
A stable system therefore has a small relative variation $\lVert \dot{x} \rVert / \lVert x \rVert$.

Differentiating the equality $\mathrm{A}(t)\, x(t) = b(t)$ gives $\dot{\mathrm{A}}(t)\,x(t) + \mathrm{A}(t)\,\dot{x}(t) = \dot{b}(t)$, hence

$$\dot{x}(t) = \mathrm{A}(t)^{-1}\dot{b}(t) - \mathrm{A}(t)^{-1}\dot{\mathrm{A}}(t)\,x(t)$$

Therefore the relative variation of the system can be upper-bounded (dropping the $t$ for brevity):

$$\dfrac{\lVert\dot{x}\rVert}{\lVert x \rVert} \leq \kappa(\mathrm{A}) \left(\dfrac{\lVert \dot{b} \rVert}{\lVert \mathrm{A}\rVert \lVert x\rVert } + \dfrac{\lVert \dot{\mathrm{A}}\rVert}{\lVert \mathrm{A}\rVert}\right)$$

Therefore, a system whose matrix has a small condition number at time $t$ will stay relatively close to its current state in the near future.
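As a closing sketch, here is the bound checked on a hypothetical system of my own choosing, $\mathrm{A}(t) = \operatorname{diag}(2+t,\, 1)$ and $b(t) = (\sin t, \cos t)$, whose derivatives are known analytically:

```python
import numpy as np

# Hypothetical system: A(t) = diag(2 + t, 1), b(t) = (sin t, cos t).
t = 0.5
A = np.diag([2.0 + t, 1.0])
dA = np.diag([1.0, 0.0])                 # dA/dt
b = np.array([np.sin(t), np.cos(t)])
db = np.array([np.cos(t), -np.sin(t)])   # db/dt

Ainv = np.linalg.inv(A)
x = Ainv @ b
dx = Ainv @ db - Ainv @ dA @ x           # the derivative formula above

# Check ||dx|| / ||x|| <= kappa(A) (||db|| / (||A|| ||x||) + ||dA|| / ||A||).
norm_A = np.linalg.norm(A, ord=2)
lhs = np.linalg.norm(dx) / np.linalg.norm(x)
rhs = np.linalg.cond(A, p=2) * (
    np.linalg.norm(db) / (norm_A * np.linalg.norm(x))
    + np.linalg.norm(dA, ord=2) / norm_A
)
assert lhs <= rhs + 1e-12
```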