
Orthogonal and orthonormal bases


You already know that any two vectors not lying on the same line form a basis in the plane. However, if you ask a random person to draw two coordinate axes on a checkered piece of paper, they will likely draw two perpendicular lines. This is because the concept of orthogonality complements the idea of choosing a basis in a vector space quite elegantly. The combination of these concepts is highly fruitful and finds applications everywhere, from dimensionality reduction in machine learning and Fourier analysis to the frontiers of modern technology and science. The scope of applications for orthogonal bases is so vast that providing a comprehensive outline of its full extent is a challenging task. Considering this, let’s try to understand where it all comes from.

Example

Consider a two-dimensional Euclidean space $(V,\langle\cdot,\cdot\rangle)$. Choose arbitrary vectors $\vec{v}_{1}$ and $\vec{v}_{2}$ forming a basis of $V$. The vector
$$\vec{v}'_{2} = \vec{v}_{2} - \frac{\langle\vec{v}_{1},\vec{v}_{2}\rangle}{\langle\vec{v}_{1},\vec{v}_{1}\rangle}\cdot\vec{v}_{1}$$
will now be of particular interest to us. Why? Because it happens to be perpendicular to $\vec{v}_{1}$:

$$\langle\vec{v}_{1},\vec{v}'_{2}\rangle = \left\langle\vec{v}_{1},\vec{v}_{2} - \frac{\langle\vec{v}_{1},\vec{v}_{2}\rangle}{\langle\vec{v}_{1},\vec{v}_{1}\rangle}\cdot\vec{v}_{1}\right\rangle = \langle\vec{v}_{1},\vec{v}_{2}\rangle - \frac{\langle\vec{v}_{1},\vec{v}_{2}\rangle}{\cancel{\langle\vec{v}_{1},\vec{v}_{1}\rangle}}\cdot\cancel{\langle\vec{v}_{1},\vec{v}_{1}\rangle} = 0$$
And since $\{\vec{v}_{1},\vec{v}_{2}\}$ forms a basis, $\vec{v}_{1}$ and $\vec{v}'_{2}$ are linearly independent. Therefore, $\{\vec{v}_{1},\vec{v}'_{2}\}$ is also a basis of $V$. So you can construct a basis of mutually orthogonal vectors for any two-dimensional space, starting from an arbitrary basis of it! Such a basis is called an orthogonal basis.

For example, if
$$\langle\vec{v}_{1},\vec{v}_{1}\rangle = 9, \qquad \langle\vec{v}_{2},\vec{v}_{2}\rangle = 8, \qquad \langle\vec{v}_{1},\vec{v}_{2}\rangle = 6,$$
then
$$\vec{v}'_{2} = \vec{v}_{2} - \frac{\langle\vec{v}_{1},\vec{v}_{2}\rangle}{\langle\vec{v}_{1},\vec{v}_{1}\rangle}\cdot\vec{v}_{1} = \vec{v}_{2} - \frac{2}{3}\vec{v}_{1},$$
and $\{\vec{v}_{1},\vec{v}_{2} - \frac{2}{3}\vec{v}_{1}\}$ is an orthogonal basis (if you want, you can check it for this particular case). It is often useful to consider a basis in which all vectors have length $1$. Right now, the vectors $\vec{v}_{1}$ and $\vec{v}'_{2}$ have lengths:

$$\|\vec{v}_{1}\| = \sqrt{\langle\vec{v}_{1},\vec{v}_{1}\rangle} = 3$$
$$\|\vec{v}'_{2}\| = \sqrt{\left\langle\vec{v}_{2}-\frac{2}{3}\vec{v}_{1},\vec{v}_{2} - \frac{2}{3}\vec{v}_{1}\right\rangle} = \sqrt{\langle\vec{v}_{2},\vec{v}_{2}\rangle - \frac{4}{3}\langle\vec{v}_{1},\vec{v}_{2}\rangle + \frac{4}{9} \langle\vec{v}_{1},\vec{v}_{1}\rangle} = 2$$
If you divide a vector by its length, you end up with a vector co-directed with the initial one but with length $1$. This process is called normalization of a vector.
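Before moving on to normalization, you can double-check this arithmetic numerically. The minimal sketch below represents vectors by their coordinates in the basis $\{\vec{v}_{1},\vec{v}_{2}\}$ and encodes the inner product through the Gram matrix $G$ with $G_{ij} = \langle\vec{v}_{i},\vec{v}_{j}\rangle$, so that $\langle\vec{a},\vec{b}\rangle = a^{\mathsf{T}}G\,b$ (the variable names here are my own choice):

```python
import numpy as np

# Gram matrix of the inner product in the basis {v1, v2}:
# G[i, j] = <v_i, v_j>, with the values from the example above.
G = np.array([[9.0, 6.0],
              [6.0, 8.0]])

def inner(a, b):
    """Inner product of vectors given by their coordinates in {v1, v2}."""
    return a @ G @ b

v1 = np.array([1.0, 0.0])   # v1 itself, written in the basis {v1, v2}
v2 = np.array([0.0, 1.0])   # v2 itself
v2_prime = v2 - inner(v1, v2) / inner(v1, v1) * v1   # v2 - (2/3) v1

print(inner(v1, v2_prime))                  # ~0.0 -> v1 and v2' are orthogonal
print(np.sqrt(inner(v1, v1)))               # 3.0  -> length of v1
print(np.sqrt(inner(v2_prime, v2_prime)))   # 2.0  -> length of v2'
```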

The result of normalizing any vector $\vec{w}$ is always a unit vector:

$$\left\|\frac{\vec{w}}{\|\vec{w}\|}\right\| = \left\|\frac{1}{\|\vec{w}\|}\cdot \vec{w}\right\| = \frac{1}{\cancel{\|\vec{w}\|}}\cdot \cancel{\|\vec{w}\|} = 1$$
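As a quick sanity check, here is a tiny sketch (plain NumPy, standard dot product assumed; the helper name normalize is my own) showing that dividing a nonzero vector by its length indeed yields a unit vector:

```python
import numpy as np

def normalize(w):
    """Return the unit vector co-directed with w (w must be nonzero)."""
    return w / np.linalg.norm(w)

w = np.array([3.0, -4.0, 12.0])      # length 13
print(np.linalg.norm(normalize(w)))  # 1.0
```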

Let’s normalize the vectors $\vec{v}_{1}$ and $\vec{v}'_{2}$:
$$\vec{e}_{1} = \frac{\vec{v}_{1}}{\|\vec{v}_{1}\|} = \frac{1}{3}\vec{v}_{1}$$
$$\vec{e}_{2} = \frac{\vec{v}'_{2}}{\|\vec{v}'_{2}\|} = \frac{1}{2}\vec{v}'_{2} = \frac{1}{2}\vec{v}_{2} - \frac{1}{3}\vec{v}_{1}$$
The vectors $\vec{e}_{1}$ and $\vec{e}_{2}$ have the same directions as $\vec{v}_{1}$ and $\vec{v}'_{2}$ (normalization does not change direction), so $\{\vec{e}_{1},\vec{e}_{2}\}$ also forms a basis of $V$, and now it is orthogonal and consists of length-one vectors:

$$\langle\vec{e}_{1},\vec{e}_{1}\rangle = \langle\vec{e}_{2},\vec{e}_{2}\rangle = 1, \qquad \langle\vec{e}_{1},\vec{e}_{2}\rangle = 0$$
Lastly, notice that these relations are the same as the relations for $\mathbb{R}^{2}$ with the standard basis and the usual dot product!
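Continuing the numeric sketch from above (same assumed Gram matrix $G$), you can normalize $\vec{v}_{1}$ and $\vec{v}'_{2}$ and confirm these relations:

```python
import numpy as np

G = np.array([[9.0, 6.0],
              [6.0, 8.0]])           # G[i, j] = <v_i, v_j>
inner = lambda a, b: a @ G @ b

v1 = np.array([1.0, 0.0])            # v1 in the basis {v1, v2}
v2_prime = np.array([-2.0 / 3.0, 1.0])   # v2 - (2/3) v1

e1 = v1 / np.sqrt(inner(v1, v1))                      # (1/3) v1
e2 = v2_prime / np.sqrt(inner(v2_prime, v2_prime))    # (1/2) v2'

print(inner(e1, e1), inner(e2, e2))   # 1.0 1.0
print(inner(e1, e2))                  # ~0.0
```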

Geometric side of the story

All the algebraic manipulations introduced above may seem a little overcomplicated. But if you look at them from a geometric point of view, everything becomes much more natural.

So let’s consider the vectors $\vec{v}_{1}$ and $\vec{v}_{2}$ again, but now as arrows on a plane:

Two generic vectors on a plane.

Now notice that the vector
$$\frac{\langle\vec{v}_{1},\vec{v}_{2}\rangle}{\langle\vec{v}_{1}, \vec{v}_{1}\rangle}\cdot\vec{v}_{1}$$
is just the projection $\mathbf{proj}_{\vec{v}_{1}}(\vec{v}_{2})$ of the vector $\vec{v}_{2}$ onto $\vec{v}_{1}$. This projection is easily illustrated with the following picture:

A projection of one vector onto another

Therefore, the vector $\vec{v}'_{2}$ is the vector $\vec{v}_{2}$ from which you ‘removed’ its projection onto $\vec{v}_{1}$:

$$\vec{v}'_{2} = \vec{v}_{2} - \mathbf{proj}_{\vec{v}_{1}}(\vec{v}_{2})$$
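In coordinates, with the usual dot product, this ‘remove the projection’ step is a one-liner. A minimal sketch (the concrete vectors below are an arbitrary illustration, not taken from the text):

```python
import numpy as np

def proj(u, v):
    """Projection of v onto u with respect to the dot product."""
    return (u @ v) / (u @ u) * u

v1 = np.array([3.0, 1.0])
v2 = np.array([1.0, 2.0])
v2_prime = v2 - proj(v1, v2)

print(v1 @ v2_prime)   # ~0.0 -> v2' is perpendicular to v1
```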

You’ve already proven that $\vec{v}'_{2}$ is perpendicular to $\vec{v}_{1}$; now it is also easy to see it directly:

Two vectors and the result of ‘removing’ a projection of second vector on the first from the second vector

Last, but not least, you normalize the vectors so that their lengths are equal to $1$. In our picture, it looks like this.

The result of constructing two vectors of length 1 by shrinking the vectors of the basis

Here $\vec{e}_{1}$ and $\vec{e}_{2}$ are the normalized versions of $\vec{v}_{1}$ and $\vec{v}'_{2}$ correspondingly. They form a basis of this plane, which is really similar to the one you usually choose in a checkered notebook:

An example of the way we usually draw a basis in a checkered notebook.

Higher dimensions

Now this idea of a basis of length-one vectors that are orthogonal to each other can be adapted to an arbitrary dimension $n$. First, let’s give it a name.

Let $(V,\langle\cdot,\cdot\rangle)$ be a Euclidean space with $\dim(V) = n$. A basis $\{\vec{e}_{1},\vec{e}_{2},\dots,\vec{e}_{n}\}$ is called orthogonal if
$$\langle \vec{e}_{i} , \vec{e}_{j} \rangle = 0$$
for any $i,j\in\{1,2,\dots,n\}$ such that $i\ne j$. It literally means that each vector of this basis is orthogonal to every other vector.

An orthogonal basis $\{\vec{e}_{1},\vec{e}_{2},\dots,\vec{e}_{n}\}$ is called orthonormal if

$$\langle\vec{e}_{i},\vec{e}_{i}\rangle = 1$$
for any $i \in \{1,2,\dots,n\}$. This means that, besides orthogonality, each vector is a unit vector.

There is a very useful conventional symbol in mathematics called the Kronecker delta. It is defined in the following manner:

$$\delta_{i,j} = \begin{cases} 1, & \text{if } i = j\\ 0, & \text{if } i\ne j \end{cases}$$
For instance, $\delta_{4,4}$ and $\delta_{3,5}$ are just $1$ and $0$ correspondingly. How can it be used? Well, you can say that a basis $\{\vec{e}_{1},\vec{e}_{2},\dots,\vec{e}_{n}\}$ is orthonormal if

$$\langle\vec{e}_{i},\vec{e}_{j}\rangle = \delta_{i,j}$$
for $i,j\in\{1,2,\dots,n\}$. This way, you can combine the two above-mentioned definitions into one.
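This compact condition translates directly into code. Below is a small sketch under the standard dot product; the helper name is_orthonormal is hypothetical:

```python
import numpy as np

def is_orthonormal(vectors, tol=1e-9):
    """Check <e_i, e_j> == delta_{i,j} for all pairs (standard dot product)."""
    n = len(vectors)
    for i in range(n):
        for j in range(n):
            delta = 1.0 if i == j else 0.0
            if abs(vectors[i] @ vectors[j] - delta) > tol:
                return False
    return True

print(is_orthonormal([np.array([1.0, 0.0]), np.array([0.0, 1.0])]))  # True
print(is_orthonormal([np.array([1.0, 1.0]), np.array([0.0, 1.0])]))  # False
```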

You will learn more about why orthonormal bases are so great later. But first, let's state that such a basis can be constructed for any finite-dimensional Euclidean space. The process of this construction is called the Gram-Schmidt process, and it goes like this:

Let’s start with $V$ (with inner product $\langle\cdot,\cdot\rangle$ and $\dim(V) = n$) and its arbitrary basis $\{\vec{v}_{1},\vec{v}_{2},\dots,\vec{v}_{n}\}$. Introduce a new basis $\{\vec{w}_{1},\vec{w}_{2},\dots,\vec{w}_{n}\}$ in the following manner.

  1. Vector $\vec{w}_{1}$ is equal to $\vec{v}_{1}$.
  2. Each following vector $\vec{w}_{k}$ (for $1<k\le n$) is defined as $\vec{v}_{k}$ from which you ‘removed’ all the projections onto the previously constructed vectors $\vec{w}_{1}$, $\vec{w}_{2}$, …, $\vec{w}_{k-1}$:
$$\vec{w}_{k} = \vec{v}_{k} - \mathbf{proj}_{\vec{w}_{1}}(\vec{v}_{k}) - \mathbf{proj}_{\vec{w}_{2}}(\vec{v}_{k}) - \dots - \mathbf{proj}_{\vec{w}_{k-1}}(\vec{v}_{k})$$
These vectors turn out to be orthogonal to each other and form a basis of $V$.
  3. Finally, normalize the vectors $\vec{w}_{k}$:
$$\vec{e}_{k} = \frac{1}{\|\vec{w}_{k}\|}\cdot\vec{w}_{k}$$
The vectors $\vec{e}_{k}$ are still orthogonal, but now also have length $1$.

Therefore, $\{\vec{e}_{1},\vec{e}_{2},\dots,\vec{e}_{n}\}$ is an orthonormal basis of $V$.
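Here is a minimal sketch of the procedure in code, assuming vectors in $\mathbb{R}^{n}$ and the dot product as the inner product (any other inner product can be passed in through the inner parameter):

```python
import numpy as np

def gram_schmidt(vectors, inner=np.dot):
    """Turn a list of linearly independent vectors into an orthonormal basis."""
    orthonormal = []
    for v in vectors:
        w = v.astype(float)
        # Remove the projections of v onto all previously constructed vectors.
        for e in orthonormal:
            w = w - inner(e, v) / inner(e, e) * e
        orthonormal.append(w / np.sqrt(inner(w, w)))   # normalize
    return orthonormal
```

Note that this sketch projects onto the already normalized vectors $\vec{e}_{i}$ rather than onto the $\vec{w}_{i}$; the result is the same, since normalization changes only the length of a vector, not its direction.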

The proof of orthogonality of $\{\vec{w}_{1},\vec{w}_{2},\dots,\vec{w}_{n}\}$ is actually quite a tricky task. Think about it like this: by ‘removing’ from $\vec{v}_{k}$ all the components co-directed with $\vec{w}_{1}$, $\vec{w}_{2}$, …, $\vec{w}_{k-1}$, you end up with a vector that is perpendicular to all of them.

Here’s an example for better understanding. Consider three linearly independent vectors in the Euclidean space $\mathbb{R}^{3}$ with the dot product chosen as the inner product (meaning that $\langle\vec{v},\vec{w}\rangle = \vec{v}\cdot\vec{w}$):

$$\vec{v}_{1} = \begin{pmatrix}1&0&1\end{pmatrix}^{\mathsf{T}}\qquad \vec{v}_{2} = \begin{pmatrix}1&-2&0\end{pmatrix}^{\mathsf{T}}\qquad \vec{v}_{3} = \begin{pmatrix}1&-1&1\end{pmatrix}^{\mathsf{T}}$$
Applying the Gram-Schmidt process:

$$\vec{w}_{1} = \vec{v}_{1} = \begin{pmatrix}1 & 0 & 1\end{pmatrix}^{\mathsf{T}}, \qquad \vec{e}_{1} = \begin{pmatrix} \frac{1}{\sqrt{2}}& 0 & \frac{1}{\sqrt{2}}\end{pmatrix}^{\mathsf{T}}\\
\vec{w}_{2} = \vec{v}_{2} - \frac{\langle\vec{w}_{1},\vec{v}_{2}\rangle}{\langle\vec{w}_{1},\vec{w}_{1}\rangle}\cdot\vec{w}_{1} = \begin{pmatrix}\frac{1}{2} & -2 & -\frac{1}{2} \end{pmatrix}^{\mathsf{T}}, \qquad \vec{e}_{2} = \begin{pmatrix}\frac{\sqrt{2}}{6}&-\frac{2\sqrt{2}}{3}&-\frac{\sqrt{2}}{6}\end{pmatrix}^{\mathsf{T}}\\
\vec{w}_{3} = \vec{v}_{3} - \frac{\langle\vec{w}_{2},\vec{v}_{3}\rangle}{\langle\vec{w}_{2},\vec{w}_{2}\rangle}\cdot\vec{w}_{2} - \frac{\langle\vec{w}_{1},\vec{v}_{3}\rangle}{\langle\vec{w}_{1},\vec{w}_{1}\rangle}\cdot\vec{w}_{1} = \begin{pmatrix}-\frac{2}{9}& -\frac{1}{9}& \frac{2}{9}\end{pmatrix}^{\mathsf{T}},\qquad \vec{e}_{3} = \begin{pmatrix}-\frac{2}{3}&-\frac{1}{3}&\frac{2}{3}\end{pmatrix}^{\mathsf{T}}$$
By calculating the dot products, you can check that $\{\vec{e}_{1},\vec{e}_{2},\vec{e}_{3}\}$ is indeed an orthonormal basis of $\mathbb{R}^{3}$.
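The same computation can be reproduced with a few lines of NumPy (dot product as the inner product):

```python
import numpy as np

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([1.0, -2.0, 0.0])
v3 = np.array([1.0, -1.0, 1.0])

w1 = v1
w2 = v2 - (w1 @ v2) / (w1 @ w1) * w1
w3 = v3 - (w2 @ v3) / (w2 @ w2) * w2 - (w1 @ v3) / (w1 @ w1) * w1

e1, e2, e3 = (w / np.linalg.norm(w) for w in (w1, w2, w3))

print(w2, w3)   # [ 0.5 -2.  -0.5]  [-0.2222... -0.1111...  0.2222...]
print(e3)       # [-0.6667 -0.3333  0.6667], i.e. (-2/3, -1/3, 2/3)
print(round(e1 @ e2, 10), round(e1 @ e3, 10), round(e2 @ e3, 10))  # 0.0 0.0 0.0
```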

Features of an orthonormal basis

The main feature of choosing an orthonormal basis in $V$ is that it effectively turns $V$ into $\mathbb{R}^{n}$ (here $n = \dim(V)$) and $\langle\cdot,\cdot\rangle$ into the standard dot product. Let's see how it works.

If $\{\vec{e}_{1},\vec{e}_{2},\dots,\vec{e}_{n}\}$ is an arbitrary basis of $V$, then you can express the inner product of vectors $\vec{a} = a_{1}\vec{e}_{1} + a_{2}\vec{e}_{2} + \dots + a_{n}\vec{e}_{n}$ and $\vec{b} = b_{1}\vec{e}_{1} + b_{2}\vec{e}_{2} + \dots + b_{n}\vec{e}_{n}$ using only the products of the form $\langle\vec{e}_{i},\vec{e}_{j}\rangle$ where $i\le j$:

$$\langle\vec{a},\vec{b}\rangle = \underbrace{a_{1}b_{1}\langle\vec{e}_{1},\vec{e}_{1}\rangle + a_{2}b_{2}\langle\vec{e}_{2},\vec{e}_{2}\rangle + \dots + a_{n}b_{n}\langle\vec{e}_{n},\vec{e}_{n}\rangle}_{i = j} + {}\\
{} + \underbrace{(a_{1}b_{2} + a_{2}b_{1})\langle\vec{e}_{1},\vec{e}_{2}\rangle + (a_{1}b_{3} + a_{3}b_{1})\langle\vec{e}_{1},\vec{e}_{3}\rangle + \dots + (a_{n-1}b_{n} + a_{n}b_{n-1})\langle\vec{e}_{n-1},\vec{e}_{n}\rangle}_{i<j}$$
This is one monstrous expression. The problem with it is that starting with $n = 4$ the number of terms with $i<j$ is greater than the number of terms with $i = j$, and it grows quadratically as $n$ increases. That means that computing inner products in an arbitrary basis is much harder than in $\mathbb{R}^{n}$ with the standard dot product, where the number of terms is just $n$.

But notice that if $\{\vec{e}_{1},\vec{e}_{2},\dots,\vec{e}_{n}\}$ is an orthonormal basis, this problem vanishes! Why? Because all the terms with $i < j$ in this enormous formula are equal to zero, and all the remaining inner products $\langle\vec{e}_{i},\vec{e}_{i}\rangle$ are equal to $1$! This results in a very elegant formula:

$$\langle\vec{a},\vec{b}\rangle = a_{1}b_{1} + a_{2}b_{2} + \dots + a_{n}b_{n}$$
Not only is this formula way simpler, it is also very familiar! It is just the dot product of the vectors $\begin{pmatrix}a_{1} & a_{2} & \dots & a_{n}\end{pmatrix}^{\mathsf{T}}\in\mathbb{R}^{n}$ and $\begin{pmatrix} b_{1} & b_{2} & \dots & b_{n} \end{pmatrix}^{\mathsf{T}}\in\mathbb{R}^{n}$. Take into account that writing down

$$\begin{pmatrix} a_{1}\\ a_{2}\\ \vdots\\ a_{n} \end{pmatrix} + \begin{pmatrix} b_{1}\\ b_{2}\\ \vdots\\ b_{n} \end{pmatrix} = \begin{pmatrix} a_{1} + b_{1}\\ a_{2} + b_{2}\\ \vdots\\ a_{n} + b_{n} \end{pmatrix}; \qquad \lambda \begin{pmatrix} a_{1}\\ a_{2}\\ \vdots\\ a_{n} \end{pmatrix} = \begin{pmatrix} \lambda a_{1}\\ \lambda a_{2}\\ \vdots\\ \lambda a_{n} \end{pmatrix}$$
is just another way of writing
$$(a_{1}\vec{e}_{1} + a_{2}\vec{e}_{2} + \dots + a_{n}\vec{e}_{n}) + (b_{1}\vec{e}_{1} + b_{2}\vec{e}_{2} + \dots + b_{n}\vec{e}_{n}) = (a_{1}+b_{1})\vec{e}_{1} + (a_{2} + b_{2})\vec{e}_{2} + \dots + (a_{n} + b_{n})\vec{e}_{n}$$
$$\lambda (a_{1}\vec{e}_{1} + a_{2}\vec{e}_{2} + \dots + a_{n}\vec{e}_{n}) = \lambda a_{1}\vec{e}_{1} + \lambda a_{2}\vec{e}_{2} + \dots + \lambda a_{n}\vec{e}_{n}$$
You can conclude that writing down vectors in an orthonormal basis of a Euclidean space $V$ is basically the same as working in $\mathbb{R}^{n}$ with the standard dot product!
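To see this numerically, the sketch below reuses the assumed Gram matrix $G$ and the orthonormal basis $\{\vec{e}_{1},\vec{e}_{2}\}$ from the first example, and checks that the ‘abstract’ inner product of two vectors equals the plain dot product of their coordinates in that basis:

```python
import numpy as np

# Inner product on V, given in the original basis {v1, v2} by a Gram matrix.
G = np.array([[9.0, 6.0],
              [6.0, 8.0]])
inner = lambda a, b: a @ G @ b

# The orthonormal basis built earlier, written in {v1, v2} coordinates.
e1 = np.array([1.0 / 3.0, 0.0])           # (1/3) v1
e2 = np.array([-1.0 / 3.0, 1.0 / 2.0])    # (1/2) v2 - (1/3) v1

# Two vectors given by their coordinates in the orthonormal basis {e1, e2}.
a_coords = np.array([2.0, -5.0])
b_coords = np.array([4.0, 1.0])
a = a_coords[0] * e1 + a_coords[1] * e2
b = b_coords[0] * e1 + b_coords[1] * e2

print(inner(a, b))           # 3.0 -- the "abstract" inner product
print(a_coords @ b_coords)   # 3.0 -- plain dot product of the coordinates
```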

There is one more way to express this property. Let $\vec{x}$ be a vector in a Euclidean space $(V,\langle\cdot,\cdot\rangle)$. Let $\{\vec{e}_{1},\dots,\vec{e}_{n}\}$ be an orthonormal basis of this space and $\vec{x} = x_{1}\vec{e}_{1} + \dots + x_{n}\vec{e}_{n}$. Now let’s calculate the following sum:

$$\sum_{i = 1}^{n}\langle\vec{x},\vec{e}_{i}\rangle\cdot\vec{e}_{i} = \langle\vec{x},\vec{e}_{1}\rangle\cdot\vec{e}_{1} + \dots + \langle\vec{x},\vec{e}_{n}\rangle\cdot\vec{e}_{n} = {}\\
{} = \langle x_{1}\vec{e}_{1} + \dots + x_{n}\vec{e}_{n},\vec{e}_{1}\rangle\cdot\vec{e}_{1} + \dots + \langle x_{1}\vec{e}_{1} + \dots + x_{n}\vec{e}_{n},\vec{e}_{n}\rangle\cdot\vec{e}_{n}$$
If you throw out all the inner products that are equal to $0$, you end up with

$$\langle\vec{x},\vec{e}_{1}\rangle\cdot\vec{e}_{1} + \dots + \langle\vec{x},\vec{e}_{n}\rangle\cdot\vec{e}_{n} = x_{1}\langle\vec{e}_{1},\vec{e}_{1}\rangle \cdot\vec{e}_{1} + \dots + x_{n}\langle\vec{e}_{n},\vec{e}_{n}\rangle\cdot\vec{e}_{n} = x_{1}\vec{e}_{1} + \dots + x_{n}\vec{e}_{n}$$
This literally means that
$$\sum_{i = 1}^{n}\langle\vec{x},\vec{e}_{i}\rangle\cdot\vec{e}_{i} = \langle\vec{x},\vec{e}_{1}\rangle\cdot\vec{e}_{1} + \dots + \langle\vec{x},\vec{e}_{n}\rangle\cdot\vec{e}_{n} = \vec{x}$$
Therefore, the coordinates $x_{i}$ are just the inner products $\langle\vec{x},\vec{e}_{i}\rangle$, that is, the projections of $\vec{x}$ onto $\vec{e}_{i}$.
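A short numeric illustration in $\mathbb{R}^{3}$ with the standard dot product (the particular orthonormal basis and vector below are arbitrary choices):

```python
import numpy as np

# An orthonormal basis of R^3 (a rotation of the standard one).
e1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
e2 = np.array([-1.0, 1.0, 0.0]) / np.sqrt(2)
e3 = np.array([0.0, 0.0, 1.0])

x = np.array([3.0, -2.0, 5.0])

# The coordinates of x in {e1, e2, e3} are just the inner products <x, e_i>.
coords = [x @ e for e in (e1, e2, e3)]
reconstructed = sum(c * e for c, e in zip(coords, (e1, e2, e3)))

print(coords)          # [0.7071..., -3.5355..., 5.0]
print(reconstructed)   # [ 3. -2.  5.] -- the original x
```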

Conclusion

Let $(V,\langle\cdot,\cdot\rangle)$ be a Euclidean space with $\dim(V) = n$.

  • A normalization of a vector $\vec{v}$ is the unit vector $\frac{1}{\|\vec{v}\|}\cdot \vec{v}$.
  • A basis $\{\vec{e}_{1},\vec{e}_{2},\dots,\vec{e}_{n}\}$ is orthogonal if $\langle\vec{e}_{i},\vec{e}_{j}\rangle = 0$ for all $i\ne j$.
  • A basis $\{\vec{e}_{1},\vec{e}_{2},\dots,\vec{e}_{n}\}$ is orthonormal if $\langle\vec{e}_{i},\vec{e}_{j}\rangle = \delta_{i,j}$.

  • If you write the vectors of $V$ in an orthonormal basis, then $V$ can be thought of as $\mathbb{R}^{n}$ and $\langle\cdot,\cdot\rangle$ as the dot product.

  • Knowing any basis of a Euclidean space, you can always obtain an orthonormal basis of this space by the Gram-Schmidt process.

  • For an orthonormal basis, the following identity holds: $\vec{x} = \langle\vec{x},\vec{e}_{1}\rangle\cdot\vec{e}_{1} + \dots + \langle\vec{x},\vec{e}_{n}\rangle\cdot\vec{e}_{n}$.
