
Linear operators and matrices


You will finally discover the intimate connection between matrices and linear transformations, which is fundamental to understanding both. With it, you'll be able to analyze a system of equations completely with just a couple of calculations!
Now you will begin to reap the fruits of your labor. You'll put to work practically everything you've learned so far. This topic will even be much easier than the previous ones: you have already done all the hard work, so it's time to enjoy!

The transformation generated by a matrix

Recent topics showed that any $m \times n$ matrix $A$ induces a linear transformation $L_A$ thanks to the properties of the matrix product:

$$L_A: \mathbb{R}^n \rightarrow \mathbb{R}^m, \text{ given by } L_A(v) = Av \text{ with } v = v_1 e_1 + \dots + v_n e_n$$

Furthermore, the null space and the range of $L_A$ tell us several properties of the system of equations associated with $A$. But let's stop for a minute to improve our notation. Although it's important to refer to the individual entries of the matrix, from now on the main focus will be its columns. Let's denote them as $A_1, \dots, A_n$ and write the matrix in terms of them as follows:

$$A = \left[\, A_1 \mid \dots \mid A_n \,\right]$$

Clearly, the columns $A_1, \dots, A_n$ are vectors in $\mathbb{R}^m$. Here comes a little surprise. Do you remember that any linear transformation is completely determined by its values on a basis? Let's see what this looks like for $L_A$ by transforming just the first vector of the canonical basis:

$$L_A(e_1) = A\,e_1 = A_1$$

Proof

By the definition of the matrix product:

$$A\,e_1 = \begin{bmatrix} (a_{11}, \dots, a_{1n}) \cdot e_1 \\ \vdots \\ (a_{m1}, \dots, a_{mn}) \cdot e_1 \end{bmatrix} = \begin{bmatrix} a_{11} \\ \vdots \\ a_{m1} \end{bmatrix} = A_1$$

By transforming $e_1$, you recover the first column of $A$! In fact, the same happens with the other vectors of the basis, and this is a key point:

The values of $L_A$ on the basis are the columns of $A$

As a direct consequence, you can rewrite the product $Av$ as a linear combination of the columns:

$$L_A(v) = Av = v_1 A_1 + \dots + v_n A_n \in \mathbb{R}^m$$
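If you want to see these two facts in action, here is a minimal NumPy sketch (the matrix and vector are made up for illustration): $Ae_1$ recovers the first column, and $Av$ equals the linear combination of the columns:

```python
import numpy as np

# An arbitrary 3x2 matrix A and a vector v in R^2 (made-up values).
A = np.array([[1.0, -1.0],
              [3.0,  2.0],
              [0.0,  1.0]])
v = np.array([2.0, 5.0])

# A e_1 recovers the first column of A.
e1 = np.array([1.0, 0.0])
print(A @ e1)                           # [1. 3. 0.] == A[:, 0]

# Av equals the linear combination v_1 A_1 + v_2 A_2 of the columns.
print(A @ v)                            # [-3. 16.  5.]
print(v[0] * A[:, 0] + v[1] * A[:, 1])  # same vector
```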

The matrix of a transformation

There's something suspicious here. You already know that all the information of an arbitrary linear transformation $T: \mathbb{R}^n \rightarrow \mathbb{R}^m$ is contained in its values on the basis:

$$T(v) = v_1 T(e_1) + \dots + v_n T(e_n)$$

So if you saved those values $\{T(e_1), \dots, T(e_n)\}$ somewhere, you could rebuild the whole transformation without any problem. But look closely at the last equation... doesn't it look very similar to the last equation of the previous section, where you saw how a matrix transforms a vector?

It seems that you can no longer avoid the fact that the place where you must store the values of the transformation is a matrix!

$$[T] = \left[\, T(e_1) \mid \dots \mid T(e_n) \,\right]$$

And $[T]$ is called the matrix associated with $T$. This means that:

$$T(v) = v_1 T(e_1) + \dots + v_n T(e_n) = [T]\,v = L_{[T]}(v)$$

Thus, to fully know all the possible values of $T$, it is enough to calculate its values on the basis and store them in a matrix!

In summary, you can think of every $m \times n$ matrix as if it were a linear transformation that deforms $\mathbb{R}^n$ into $\mathbb{R}^m$ and, conversely, every linear transformation between these spaces is completely determined by an $m \times n$ matrix.
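As code, the recipe is direct: transform each canonical basis vector and stack the results as columns. A minimal sketch, with a made-up helper name and example map:

```python
import numpy as np

def matrix_of(T, n):
    """Build [T] = [T(e_1) | ... | T(e_n)] by transforming the canonical basis."""
    basis = np.eye(n)                      # rows are e_1, ..., e_n
    return np.column_stack([T(basis[i]) for i in range(n)])

# A made-up linear map R^2 -> R^2 just to exercise the helper.
T = lambda v: np.array([2 * v[0] + v[1], v[0] - v[1]])
print(matrix_of(T, 2))
# [[ 2.  1.]
#  [ 1. -1.]]
```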

Calculating the matrix of a transformation

Before you take a closer look at the relationship we just discovered, let's get our hands dirty with some examples.

Let's start with the following transformation:

$$T: \mathbb{R}^2 \rightarrow \mathbb{R}^3 \qquad T\begin{pmatrix}x \\ y\end{pmatrix} = \begin{pmatrix}x - y \\ 3x + 2y \\ y\end{pmatrix}$$

The first thing you have to do is calculate its values on the basis:

$$\begin{align*} T(e_1) &= T\begin{pmatrix}1 \\ 0\end{pmatrix} = \begin{pmatrix}1 \\ 3 \\ 0\end{pmatrix} \\ T(e_2) &= T\begin{pmatrix}0 \\ 1\end{pmatrix} = \begin{pmatrix}-1 \\ 2 \\ 1\end{pmatrix} \end{align*}$$

Finally, use these values as the columns of the matrix of $T$:

$$[T] = \left[\, T(e_1) \mid T(e_2) \,\right] = \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ 0 & 1 \end{bmatrix}$$

Suppose now that you need to solve a system of linear equations $[T]\,x = b$, with $x \in \mathbb{R}^2$ and $b \in \mathbb{R}^3$. The columns of $[T]$ are linearly independent (check it!), so the range of $T$ is a plane in $\mathbb{R}^3$. By the dimension theorem, the null space of $T$ only contains $0$, and this, in turn, means that $T$ is injective. In consequence:

  1. $[T]\,x = b$ doesn't always have a solution. For instance, if $b = (1, 0, 0)^T$ there aren't any solutions: the third equation forces $y = 0$ and the first forces $x = 1$, but then $3x + 2y = 3 \neq 0$.
  2. When $[T]\,x = b$ has a solution, it's unique. For example, if $b = (0, 5, 1)^T$ the unique solution is $(1, 1)^T$.

You've just derived a couple of properties of $T$ and proved that a particular system of equations doesn't always have a solution just by building the matrix of $T$. Great, don't you think?
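To double-check both claims numerically, here is a quick NumPy sketch (variable names are my own). `np.linalg.lstsq` returns the best least-squares candidate together with the residual: a nonzero residual means the system has no exact solution, while a zero residual recovers the unique solution:

```python
import numpy as np

T_mat = np.array([[1.0, -1.0],
                  [3.0,  2.0],
                  [0.0,  1.0]])

# b = (1, 0, 0): a nonzero residual means no exact solution exists.
x, residual, rank, _ = np.linalg.lstsq(T_mat, np.array([1.0, 0.0, 0.0]), rcond=None)
print(rank, residual)   # rank 2, residual > 0 -> the system is unsolvable

# b = (0, 5, 1): zero residual, and full column rank makes the solution unique.
x, residual, rank, _ = np.linalg.lstsq(T_mat, np.array([0.0, 5.0, 1.0]), rcond=None)
print(x, residual)      # x ~ [1. 1.], residual ~ 0
```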

The range of $T$ is a plane

Now let's take the following operator:

$$T: \mathbb{R}^3 \rightarrow \mathbb{R}^3 \qquad T\begin{pmatrix}x \\ y \\ z\end{pmatrix} = \begin{pmatrix}x + y + z \\ 2y - 3z \\ z\end{pmatrix}$$

Transforming the basis:

$$\begin{align*} T(e_1) &= T\begin{pmatrix}1 \\ 0 \\ 0\end{pmatrix} = \begin{pmatrix}1 \\ 0 \\ 0\end{pmatrix} \\ T(e_2) &= T\begin{pmatrix}0 \\ 1 \\ 0\end{pmatrix} = \begin{pmatrix}1 \\ 2 \\ 0\end{pmatrix} \\ T(e_3) &= T\begin{pmatrix}0 \\ 0 \\ 1\end{pmatrix} = \begin{pmatrix}1 \\ -3 \\ 1\end{pmatrix} \end{align*}$$

Thus the matrix of $T$ is:

$$[T] = \left[\, T(e_1) \mid T(e_2) \mid T(e_3) \,\right] = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & -3 \\ 0 & 0 & 1 \end{bmatrix}$$

Now let's look at the operator. The matrix is upper triangular with nonzero diagonal entries, so its columns are linearly independent and its range is all of $\mathbb{R}^3$. This means that the system $[T]\,x = b$ has a solution for any $b \in \mathbb{R}^3$. By the dimension theorem, the null space of $T$ is just $\{0\}$; in consequence, $T$ is injective. But as it's an operator, it turns out that it's also bijective! Thus, the system $[T]\,x = b$ always has a unique solution.
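Since the matrix is invertible, `np.linalg.solve` finds that unique solution for any right-hand side. A minimal sketch with a made-up $b$:

```python
import numpy as np

T_mat = np.array([[1.0, 1.0,  1.0],
                  [0.0, 2.0, -3.0],
                  [0.0, 0.0,  1.0]])

# The matrix is invertible, so every b has exactly one solution.
b = np.array([3.0, -1.0, 1.0])        # made-up right-hand side
x = np.linalg.solve(T_mat, b)
print(x)                              # the unique solution [1. 1. 1.]
print(np.allclose(T_mat @ x, b))      # True
```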

Two sides of the same coin

Hopefully, the above examples have given you an idea of the power of the relationship between linear transformations and matrices. Consider the set of all linear transformations from $\mathbb{R}^n$ into $\mathbb{R}^m$, denoted by $L(\mathbb{R}^n, \mathbb{R}^m)$, and the set of $m \times n$ matrices $M_{m \times n}$.

To begin with, you know that they are both vector spaces. But the interesting thing is that now you can connect them with a function! It is natural for this function to be the one that assigns to each linear transformation its matrix, $T \mapsto [T]$. It is quite easy to show that:

$$[T + S] = [T] + [S] \quad \text{and} \quad [\lambda T] = \lambda [T]$$

Proof

$$\begin{align*} (T+S)(v) &= v_1 (T+S)(e_1) + \dots + v_n (T+S)(e_n) \\ &= v_1 T(e_1) + v_1 S(e_1) + \dots + v_n T(e_n) + v_n S(e_n) \\ &= v_1 T(e_1) + \dots + v_n T(e_n) + v_1 S(e_1) + \dots + v_n S(e_n) \\ &= [T]\,v + [S]\,v \\ &= ([T] + [S])\,v \end{align*}$$

$$\begin{align*} (\lambda T)(v) &= v_1 (\lambda T)(e_1) + \dots + v_n (\lambda T)(e_n) \\ &= \lambda \left( v_1 T(e_1) + \dots + v_n T(e_n) \right) \\ &= \lambda \left( [T]\,v \right) \\ &= (\lambda [T])\,v \end{align*}$$

You have just built a linear transformation between $L(\mathbb{R}^n, \mathbb{R}^m)$ and $M_{m \times n}$! Moreover, this function is injective and surjective, which implies that it is an isomorphism! In particular, the dimension theorem tells you that both spaces have the same dimension, $m\,n$.

Proof
  • Injective: The constant transformation $0$ sends the entire basis to the vector $0$, so its matrix is the zero matrix. But if another transformation had that matrix, then it would also send the entire basis to $0$; it would then coincide with the constant transformation $0$ on the basis, and therefore the two would be equal.
  • Surjective: Given any matrix, we can construct a linear transformation by simply specifying that it sends the first vector of the basis to the first column of the matrix, the second vector of the basis to the second column, and so on. The matrix associated with that transformation is then the original matrix.
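Here is a small numerical sanity check of the identities $[T + S] = [T] + [S]$ and $[\lambda T] = \lambda [T]$. It reuses the made-up `matrix_of` helper from earlier and two illustrative linear maps:

```python
import numpy as np

def matrix_of(T, n):
    """[T] = [T(e_1) | ... | T(e_n)]."""
    return np.column_stack([T(np.eye(n)[i]) for i in range(n)])

# Two made-up linear maps R^2 -> R^3 for illustration.
T = lambda v: np.array([v[0] - v[1], 3 * v[0] + 2 * v[1], v[1]])
S = lambda v: np.array([v[0], v[1], v[0] + v[1]])

# [T + S] == [T] + [S]
lhs = matrix_of(lambda v: T(v) + S(v), 2)
print(np.allclose(lhs, matrix_of(T, 2) + matrix_of(S, 2)))  # True

# [λT] == λ[T]
lam = 2.5
print(np.allclose(matrix_of(lambda v: lam * T(v), 2),
                  lam * matrix_of(T, 2)))                   # True
```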

This means that these sets are essentially the same: you can think of them as two views of the same object. Since they are almost the same, you can study the properties of one through those of the other. Sometimes a result is easier to prove for matrices, and then it automatically holds for transformations, and vice versa.

Have you ever wondered why the matrix product has such a strange definition? Well, it turns out that it is designed to mimic the composition of linear transformations. If, after all, both sets are almost the same, it is not crazy to think that operations defined in one have an analog in the other. This very useful connection is reflected in the following result:

If $S: \mathbb{R}^m \rightarrow \mathbb{R}^n$ and $T: \mathbb{R}^n \rightarrow \mathbb{R}^p$ are linear, then:

$$[T \circ S] = [T]\,[S]$$
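The same helper lets you verify this numerically. Below, $S$ and $T$ are made-up linear maps with compatible dimensions; the matrix built from the composition coincides with the matrix product:

```python
import numpy as np

def matrix_of(T, n):
    """[T] = [T(e_1) | ... | T(e_n)]."""
    return np.column_stack([T(np.eye(n)[i]) for i in range(n)])

# Made-up example: S: R^2 -> R^3 and T: R^3 -> R^2.
S = lambda v: np.array([v[0], v[1], v[0] + v[1]])
T = lambda w: np.array([w[0] + w[2], 2 * w[1] - w[2]])

# The matrix of the composition T∘S equals the product [T][S].
composed = matrix_of(lambda v: T(S(v)), 2)
print(np.allclose(composed, matrix_of(T, 3) @ matrix_of(S, 2)))  # True
```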

Conclusion

Let $L(\mathbb{R}^n, \mathbb{R}^m)$ be the set of all linear transformations from $\mathbb{R}^n$ into $\mathbb{R}^m$, and $M_{m \times n}$ the set of $m \times n$ matrices.

Let $T \in L(\mathbb{R}^n, \mathbb{R}^m)$ and $A \in M_{m \times n}$.

  • The values of $L_A$ on the basis are the columns of $A$

  • $L_A(v) = Av = v_1 A_1 + \dots + v_n A_n$

  • $[T] = \left[\, T(e_1) \mid \dots \mid T(e_n) \,\right]$

  • $[T + S] = [T] + [S]$ and $[\lambda T] = \lambda [T]$

  • There is an isomorphism between $L(\mathbb{R}^n, \mathbb{R}^m)$ and $M_{m \times n}$: you can think of linear transformations and matrices as the same objects seen in two different ways.

  • If $S: \mathbb{R}^m \rightarrow \mathbb{R}^n$ and $T: \mathbb{R}^n \rightarrow \mathbb{R}^p$ are linear, then $[T \circ S] = [T]\,[S]$
