
Now that you know what linear transformations are, you're ready to delve deeper into their properties. We're going to introduce two new sets intimately related to them: the null space and the image. Both arise naturally in the context of systems of equations.

Thanks to them, you will be able to reinterpret these systems, discover several relevant facts about them, and even analyze them geometrically. Here, you'll work with a linear transformation $T$ between two vector spaces $V$ and $W$.

Null space

Let's look at an $m \times n$ homogeneous system of linear equations. You already know that it can be represented as a matrix equation:

$$Ax = 0$$

But if you remember the linear transformation $L_A$ associated with $A$, which is given by $L_A(x) = Ax$, then you can write the equation:

$$L_A(x) = 0$$

So, the solutions to $Ax = 0$ are the vectors that $L_A$ maps to $0$. Well, it turns out that for any linear transformation $T$, these vectors make up an important set, called the null space of $T$ (or kernel of $T$):

$$\ker(T) = \{v \in V : T(v) = 0\}$$

The null space is full of surprises that you will discover now and in future topics. The first one is that it's not just a set, but a subspace. In fact, it's quite straightforward to check this.

Proof that $\ker(T)$ is a subspace

First, $T(0) = 0$, so $0 \in \ker(T)$. Now, if $v, u \in \ker(T)$, then $T(v) = T(u) = 0$. It follows that $T(u + v) = T(u) + T(v) = 0 + 0 = 0$, and by definition $u + v \in \ker(T)$. Similarly, $T(\lambda v) = \lambda T(v) = \lambda 0 = 0$, so $\lambda v \in \ker(T)$. $\blacksquare$

Going back to systems of linear equations, you've just shown that the set of solutions of the homogeneous system $Ax = 0$ is the null space of $L_A$! Now, let's say that:

$$A = \begin{pmatrix} 6 & 2 & 8 \\ 15 & 5 & 20 \end{pmatrix}$$

You can find the solutions to the system $Ax = 0$ by computing a row-echelon form of $A$. Then you'll find out that:

$$\ker(L_A) = \{\lambda (-1, 3, 0) + \mu (-4, 0, 3) : \lambda, \mu \in \mathbb{R}\}$$

This means that $(-1, 3, 0)^T$ and $(-4, 0, 3)^T$ form a basis, so the null space of $L_A$ is a plane.

The null space of T is a plane
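If you'd like to verify this computation yourself, here is a minimal sketch using Python's sympy library (the choice of tool is an assumption; the computation is the one described above):

```python
from sympy import Matrix

# The matrix A from the example above
A = Matrix([[6, 2, 8],
            [15, 5, 20]])

# nullspace() returns a basis of ker(L_A) as column vectors
for v in A.nullspace():
    print(v.T)
# Matrix([[-1/3, 1, 0]]) and Matrix([[-4/3, 0, 1]]):
# scalar multiples of (-1, 3, 0) and (-4, 0, 3), spanning the same plane
```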

Null space properties

Transformations that send different vectors to different vectors are called injective. For them, the only vector that maps to $0$ is $0$ itself (remember that $T(0) = 0$). Consequently, $\ker(T) = \{0\}$. So, injective transformations have the smallest possible null space.

It seems intuitive that the more vectors $T$ maps to $0$, the less injective it is. And this suggests that when the null space is as small as possible, $T$ is, in fact, injective. Well, the intuition is correct, and this is the second surprise of the null space:

$T$ is injective if and only if $\ker(T) = \{0\}$

Proof

We have already shown that if $T$ is injective, its null space only contains $0$.

Conversely, suppose $\ker(T) = \{0\}$ and $T(v) = T(u)$. Then $T(v) - T(u) = 0$, from which we immediately obtain that $T(v - u) = 0$. By definition, this means that $v - u$ is in the null space. But by hypothesis, the only vector in the null space is $0$, so $v - u = 0$, which implies $u = v$. $\blacksquare$

This result means that, in order to prove that $T$ is injective, it's enough to show that $T(v) = 0$ implies that $v = 0$.
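This criterion translates directly into a computational test. Here is a small sketch in sympy (the example matrices are made up for illustration):

```python
from sympy import Matrix

def is_injective(A: Matrix) -> bool:
    # L_A is injective iff ker(L_A) = {0},
    # i.e. nullspace() returns no basis vectors
    return len(A.nullspace()) == 0

print(is_injective(Matrix([[1, 0], [0, 1], [2, 3]])))   # True: independent columns
print(is_injective(Matrix([[6, 2, 8], [15, 5, 20]])))   # False: ker(L_A) is a plane
```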

Do you remember that linear transformations preserve the structure of vector spaces? Well, the third surprise is that transformations whose null space is as small as possible preserve linear independence. More precisely, if $v_1, \dots, v_k$ are linearly independent and $T$ is injective, then $T(v_1), \dots, T(v_k)$ are linearly independent.

Proof

Suppose that $v_1, \dots, v_k$ are linearly independent, and consider $\lambda_1 T(v_1) + \dots + \lambda_k T(v_k) = 0$. By linearity, $T(\lambda_1 v_1 + \dots + \lambda_k v_k) = 0$, and as $T$ is injective, from the previous result we get that $\lambda_1 v_1 + \dots + \lambda_k v_k = 0$. But as these vectors are linearly independent, we know that $\lambda_1 = \dots = \lambda_k = 0$. So $T(v_1), \dots, T(v_k)$ are linearly independent. $\blacksquare$

Range

Let's go back to systems of equations. When they are not homogeneous, they look like this:

$$Ax = b$$

But that's exactly $L_A(x) = b$. This means that there must be a vector $x$ whose image under $L_A$ is $b$. In general, if you put together the images under $T$ of all the vectors of $V$, the result is a set called the range of $T$ (or image of $T$):

$$\mathrm{Im}(T) = \{T(v) : v \in V\}$$

The range of $T$ has a close relationship to its kernel. Both possess similar properties and actually complement each other.

To begin with, while the null space contains vectors of $V$, the range is a subset of $W$. And like the null space, it is a subspace.

Proof

If $v, u \in V$, then $T(v), T(u) \in \mathrm{Im}(T)$. Clearly $T(v) + T(u) = T(v + u)$, and as $v + u \in V$, we get that $T(v) + T(u) \in \mathrm{Im}(T)$. Similarly, as $\lambda v \in V$, it's immediate that $\lambda T(v) = T(\lambda v) \in \mathrm{Im}(T)$. $\blacksquare$

If, for example, $T$ is a transformation from $\mathbb{R}^4$ into $\mathbb{R}^3$, this means that it collapses all of $\mathbb{R}^4$ into a point, a line, a plane, or even all of $\mathbb{R}^3$.

Now, the system $Ax = b$ has a solution precisely when some vector $x$ satisfies $L_A(x) = b$, which means that $b \in \mathrm{Im}(L_A)$. Therefore, saying that the system of equations $Ax = b$ has a solution is the same as saying that $b$ is in $\mathrm{Im}(L_A)$! Let's continue with the matrix from the first example.

Writing $x = (x, y, z)^T$, you get:

$$\begin{align*} L_A(x) &= x L_A(e_1) + y L_A(e_2) + z L_A(e_3) = x A e_1 + y A e_2 + z A e_3 \\ &= x \begin{pmatrix} 6 \\ 15 \end{pmatrix} + y \begin{pmatrix} 2 \\ 5 \end{pmatrix} + z \begin{pmatrix} 8 \\ 20 \end{pmatrix} \\ &= 3x \begin{pmatrix} 2 \\ 5 \end{pmatrix} + y \begin{pmatrix} 2 \\ 5 \end{pmatrix} + 4z \begin{pmatrix} 2 \\ 5 \end{pmatrix} \\ &= (3x + y + 4z) \begin{pmatrix} 2 \\ 5 \end{pmatrix} \end{align*}$$

Then $(2, 5)^T$ generates the range of $L_A$, and so it is a line:

The range of T is a line
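Again, you can double-check this with sympy (same assumed setup as before):

```python
from sympy import Matrix

A = Matrix([[6, 2, 8],
            [15, 5, 20]])

# columnspace() returns a basis of Im(L_A)
print(A.columnspace())
# [Matrix([[6], [15]])]: a single vector, 3*(2, 5)^T, so the range is a line
```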

Range properties

Transformations whose range takes up the entire space $W$ are called surjective. This means that $\mathrm{Im}(T) = W$. In other words, surjective transformations have the biggest possible range.

Linear transformations "preserve" spanning sets:

If $\mathrm{span}(v_1, v_2, \dots, v_n) = V$, then $\mathrm{span}(T(v_1), T(v_2), \dots, T(v_n)) = \mathrm{Im}(T)$

Proof

If $T(v) \in \mathrm{Im}(T)$, then $v \in V$, and by hypothesis $v = \lambda_1 v_1 + \lambda_2 v_2 + \dots + \lambda_n v_n$. Applying $T$, we get that

$$T(v) = T(\lambda_1 v_1 + \lambda_2 v_2 + \dots + \lambda_n v_n) = \lambda_1 T(v_1) + \lambda_2 T(v_2) + \dots + \lambda_n T(v_n),$$

so $T(v) \in \mathrm{span}(T(v_1), T(v_2), \dots, T(v_n))$. Conversely, each $T(v_i)$ belongs to $\mathrm{Im}(T)$, and since $\mathrm{Im}(T)$ is a subspace, it contains the whole span. $\blacksquare$

You can use this result to calculate the rank of a matrix. Since $L_A(e_i) = A e_i$ is the $i$-th column of $A$, it follows immediately that the range of $L_A$ is the space spanned by the columns of $A$. The dimension of $\mathrm{Im}(L_A)$ is known as the rank of $A$.
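In sympy terms (still just a sketch), the rank and the column-space basis agree exactly as described above:

```python
from sympy import Matrix

A = Matrix([[6, 2, 8],
            [15, 5, 20]])

# rank(A) = dim Im(L_A) = number of vectors in a basis of the column space
print(A.rank())              # 1
print(len(A.columnspace()))  # 1, the same number
```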

In summary, injective transformations have the smallest possible null space and preserve linear independence. On the other hand, surjective transformations have the largest possible range and preserve spanning sets.

Putting it all together, bijective transformations (injective and surjective simultaneously) preserve linearly independent spanning sets; in other words, they convert bases of $V$ into bases of $W$! They are so important that they have their own name: isomorphisms.

Equilibrium

Up to this point, it seems pretty clear that the kernel and the image tell us several things about the behavior of $T$ and have similar properties.

However, their connection is so intimate that they balance each other: the smaller one is, the bigger the other. More precisely, as $\ker(T)$ is a subspace of $V$, it's clear that $\dim \ker(T) \leq \dim V$. Well, it turns out that $\dim \mathrm{Im}(T)$ is exactly the amount missing to reach $\dim V$: that is, $\dim \ker(T) + \dim \mathrm{Im}(T) = \dim V$.

This is precisely the content of the following theorem, which is both a theoretical and a practical tool that you will use frequently in the future. It receives a name that is not at all modest: the dimension theorem (also known as the fundamental theorem of linear transformations).

$$\dim V = \dim \ker(T) + \dim \mathrm{Im}(T)$$

The dimension theorem combines the dimension of the null space with the dimension of the range
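You can watch the theorem hold on the running example; here is a minimal sympy check (assuming the same matrix as before):

```python
from sympy import Matrix

A = Matrix([[6, 2, 8],
            [15, 5, 20]])

dim_V = A.shape[1]            # L_A maps R^3 to R^2, so dim V = number of columns
dim_ker = len(A.nullspace())  # 2: the plane from the first example
dim_im = A.rank()             # 1: the line from the second example

assert dim_V == dim_ker + dim_im  # 3 == 2 + 1
```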

Proof

Let $\{v_1, \dots, v_m\}$ be a basis for $\ker(T)$. We can extend it to a basis of $V$ by adding more vectors $\{u_1, \dots, u_n\}$. Then $\dim \ker(T) = m$ and $\dim V = m + n$. What we're going to prove is that $\{T(u_1), \dots, T(u_n)\}$ is a basis for $\mathrm{Im}(T)$.

First, it's a spanning set, because if $v \in V$, then there are $\lambda_1, \dots, \lambda_m, \mu_1, \dots, \mu_n \in \mathbb{R}$ such that:

$$v = \lambda_1 v_1 + \dots + \lambda_m v_m + \mu_1 u_1 + \dots + \mu_n u_n$$

By applying $T$ we get that:

$$\begin{align*} T(v) &= \lambda_1 T(v_1) + \dots + \lambda_m T(v_m) + \mu_1 T(u_1) + \dots + \mu_n T(u_n) \\ &= \lambda_1 0 + \dots + \lambda_m 0 + \mu_1 T(u_1) + \dots + \mu_n T(u_n) \\ &= 0 + \dots + 0 + \mu_1 T(u_1) + \dots + \mu_n T(u_n) \\ &= \mu_1 T(u_1) + \dots + \mu_n T(u_n) \end{align*}$$

Finally, we will verify that $T(u_1), \dots, T(u_n)$ are linearly independent. If $\mu_1 T(u_1) + \dots + \mu_n T(u_n) = 0$, then $T(\mu_1 u_1 + \dots + \mu_n u_n) = 0$. This means that $\mu_1 u_1 + \dots + \mu_n u_n \in \ker(T)$, and as $\{v_1, \dots, v_m\}$ is a basis for $\ker(T)$, we get that:

$$\mu_1 u_1 + \dots + \mu_n u_n = \lambda_1 v_1 + \dots + \lambda_m v_m$$

for some $\lambda_1, \dots, \lambda_m \in \mathbb{R}$. But this implies that $\mu_1 u_1 + \dots + \mu_n u_n - \lambda_1 v_1 - \dots - \lambda_m v_m = 0$. And as $\{v_1, \dots, v_m, u_1, \dots, u_n\}$ is a basis of $V$, we get that $\mu_1 = \dots = \mu_n = 0$, so $T(u_1), \dots, T(u_n)$ are indeed linearly independent. $\blacksquare$

As a first sample of the power of this theorem, you can show really easily that when $\dim(V) = \dim(W)$, the statements that $T$ is injective, surjective, and bijective are all equivalent! Indeed, if $\ker(T) = \{0\}$, the theorem gives $\dim \mathrm{Im}(T) = \dim V = \dim W$, so $\mathrm{Im}(T) = W$; and the argument reverses.

Here are a couple of applications to systems of linear equations. It is usually quite easy to find the null space of $L_A$, so from the dimension theorem, you can quickly calculate the rank of $A$ and then determine whether the range is a line, a plane, or something else.

On the other hand, when the matrix is square, of size $n \times n$, you can interpret its rank as a measure of how likely it is to find unique solutions. The larger the rank, the closer $\mathrm{Im}(L_A)$ is to $\mathbb{R}^n$, and when $\mathrm{rank}(A) = n$ we get $\mathrm{Im}(L_A) = \mathbb{R}^n$. From the last theorem, you know that this means that $L_A$ is bijective, so $L_A(x) = b$ is satisfied by a unique $x$. In other words, the system $Ax = b$ has a solution, and that solution is unique.
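As a small illustration (the matrix below is a made-up example, not the article's running one), a full-rank square system has exactly one solution:

```python
from sympy import Matrix, linsolve, symbols

A = Matrix([[2, 1],
            [1, 1]])   # a hypothetical 2x2 matrix with rank 2
b = Matrix([3, 2])

print(A.rank())        # 2 == n, so L_A is bijective

x, y = symbols('x y')
print(linsolve((A, b), x, y))  # {(1, 1)}: the unique solution
```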

Conclusion

  • The null space of $T$ is $\ker(T) = \{v \in V : T(v) = 0\}$.
  • The solutions of a homogeneous system of linear equations $Ax = 0$ form the null space of $L_A$, where $A$ is the matrix of the system.
  • $T$ is injective if and only if $\ker(T) = \{0\}$.
  • The range of $T$ is $\mathrm{Im}(T) = \{T(v) : v \in V\}$.
  • An inhomogeneous system of linear equations $Ax = b$ has a solution if and only if $b \in \mathrm{Im}(L_A)$.
  • $T$ is surjective if and only if $\mathrm{Im}(T) = W$.
  • The dimension theorem states that $\dim V = \dim \ker(T) + \dim \mathrm{Im}(T)$.
