
Introduction to linear operators


Vector spaces by themselves have a lot of useful properties. But it's not until we connect them through special functions that we discover their deepest qualities and obtain the strongest results.

In this topic, we'll get to know these functions, their most important properties, and some examples. Later we will investigate more elaborate results and their relationship with matrices. As in previous topics, we'll denote by $V$ and $W$ two vector spaces.

Preserving the structure

A function $f: A \rightarrow B$ connects sets, so it acts as a bridge between them. But vector spaces are not simple sets; they have an algebraic structure.

Basic function between sets

This suggests that we should be primarily focused on the functions that preserve such a structure, i.e., those that get along well with the sum of vectors and the product of a vector by a scalar. These functions are called linear transformations (usually denoted by a more dramatic $T$) and are the core of linear algebra.

A function that preserves the structure

From now on in this course, practically all the concepts and results will involve linear transformations in one way or another.

  • Central notions such as matrices, eigenvectors, and orthogonal projections are closely related to these functions.
  • We can also use them to calculate the dimensions of spaces, interpret many geometric transformations, change coordinates, exploit the bases of spaces, simplify many calculations, approximate more complicated functions, fully understand systems of linear equations, and much more.

We are now ready to introduce the definition of a linear transformation and explore its immediate properties.

Definition and basic properties

A function $T: V \rightarrow W$ is a linear transformation if for all $v, u \in V$ and for every $\lambda \in \mathbb{R}$:

  1. $T(v+u) = T(v) + T(u)$
  2. $T(\lambda v) = \lambda T(v)$

The first point means that if you have calculated the value of $T$ at both $v$ and $u$, then you can reduce the calculation of $T$ at the sum to simply the sum of those values. In the same way, the second point establishes that once you have transformed $v$ under $T$, you automatically know the transformation of any scalar multiple of $v$.

The two properties merge when we analyze how linear transformations interact with linear combinations:

$$\begin{align*} T(\lambda_1 v_1 + \lambda_2 v_2 + \dots + \lambda_n v_n) &= T(\lambda_1 v_1) + T(\lambda_2 v_2) + \dots + T(\lambda_n v_n) \\ &= \lambda_1 T(v_1) + \lambda_2 T(v_2) + \dots + \lambda_n T(v_n) \end{align*}$$
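If you like to experiment, these two properties are easy to spot-check numerically. Below is a minimal sketch in Python with NumPy (the helper `looks_linear`, the random sampling, and the tolerance are my own choices, not part of the theory): a failed check disproves linearity, while a passed check is merely supporting evidence.

```python
import numpy as np

def looks_linear(T, dim, trials=100, tol=1e-9):
    """Spot-check T(v + u) == T(v) + T(u) and T(c * v) == c * T(v)
    on random vectors. A failure disproves linearity; passing is
    only numerical evidence, not a proof."""
    rng = np.random.default_rng(0)
    for _ in range(trials):
        v, u = rng.normal(size=dim), rng.normal(size=dim)
        c = rng.normal()
        if not np.allclose(T(v + u), T(v) + T(u), atol=tol):
            return False
        if not np.allclose(T(c * v), c * T(v), atol=tol):
            return False
    return True

print(looks_linear(lambda v: 3 * v, dim=4))    # True: scaling is linear
print(looks_linear(lambda v: v + 1.0, dim=4))  # False: shifting is not
```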

It's time to meet a fundamental property of linear transformations: they map the $0$ of $V$ to the $0$ of $W$; that is, $T(0) = 0$. You can verify this quickly: since $T(0) = T(0+0) = T(0) + T(0)$, subtracting $T(0)$ from both sides of the equation gives $T(0) = 0$.

This gives us a simple test to determine if a transformation is not linear:

If $T(0) \neq 0$, then $T$ is not linear.

The most important linear transformations are those that map a space to itself. Their properties are so special that they deserve their own name: if $T: V \rightarrow V$ is linear, then we call it a linear operator.

Examples of linear transformations

Undoubtedly, the simplest linear transformation is the zero constant function $\mathscr{O}$, which is given by $\mathscr{O}(v) = 0$. It is immediate that it is linear, since $\mathscr{O}(v+u) = 0 = 0 + 0 = \mathscr{O}(v) + \mathscr{O}(u)$ and $\mathscr{O}(\lambda v) = 0 = \lambda 0 = \lambda \mathscr{O}(v)$.

Another well-known function is the identity function $I: V \rightarrow V$, which is simply given by $I(v) = v$. It's easy to see that it's also linear, so you can quickly check it yourself. Since it maps $V$ to itself, it is a linear operator.

Let us consider a slightly more difficult function $T: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ given by $T(x, y) = (x+y, -x-y)$. Let's prove that this function is linear:

Proof that $T(x, y) = (x+y, -x-y)$ is linear

Let $v = (x_1, y_1)$ and $u = (x_2, y_2)$; then:

$$\begin{align*} T(v+u) &= T(x_1 + x_2, y_1 + y_2) \\ &= \left((x_1 + x_2) + (y_1 + y_2), -(x_1 + x_2) - (y_1 + y_2)\right) \\ &= (x_1 + y_1, -x_1 - y_1) + (x_2 + y_2, -x_2 - y_2) \\ &= T(v) + T(u) \end{align*}$$

$$T(\lambda v) = T(\lambda x_1, \lambda y_1) = (\lambda(x_1 + y_1), \lambda(-x_1 - y_1)) = \lambda(x_1 + y_1, -x_1 - y_1) = \lambda T(v)$$

In future topics, you will learn to analyze linear transformations geometrically, but for now, just note that this operator collapses the entire $\mathbb{R}^2$ plane onto the line generated by the vector $(1, -1)$.

A linear transformation that collapses a plane into a line
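If you want to see this collapse numerically, here is a small sketch (Python with NumPy; nothing is assumed beyond the formula for $T$): every image is a scalar multiple of $(1, -1)$.

```python
import numpy as np

def T(v):
    # T(x, y) = (x + y, -x - y), the operator from the example above
    return np.array([v[0] + v[1], -v[0] - v[1]])

rng = np.random.default_rng(1)
direction = np.array([1.0, -1.0])
for _ in range(5):
    v = rng.normal(size=2)
    # T(v) equals (x + y) * (1, -1), so every image lies on span{(1, -1)}
    assert np.allclose(T(v), (v[0] + v[1]) * direction)
print("Every sampled image lies on the line generated by (1, -1).")
```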

Now let's look at a really important example. Let's take an $m \times n$ matrix $A$. We can define a function on $\mathbb{R}^n$ given by the product $Av$, which is a vector in $\mathbb{R}^m$. This function is the transformation associated with $A$ and is denoted as:

$$L_A: \mathbb{R}^n \rightarrow \mathbb{R}^m, \qquad L_A(v) = Av$$

From the properties of the matrix product, it is clear that it's a linear transformation: $L_A(v+u) = A(v+u) = Av + Au = L_A(v) + L_A(u)$ and $L_A(\lambda v) = A(\lambda v) = \lambda(Av) = \lambda L_A(v)$.
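Here is a quick sketch of this construction in NumPy (the matrix entries below are arbitrary): `L_A` is just matrix-vector multiplication, and both linearity identities hold up numerically.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])  # an arbitrary 2x3 matrix: L_A maps R^3 to R^2

def L_A(v):
    return A @ v                  # the transformation associated with A

v = np.array([1.0, 0.0, 2.0])
u = np.array([0.0, 3.0, 1.0])
lam = -2.5

print(np.allclose(L_A(v + u), L_A(v) + L_A(u)))  # True: additivity
print(np.allclose(L_A(lam * v), lam * L_A(v)))   # True: homogeneity
```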

Now we look at a slightly more extravagant transformation. If $M_n$ denotes the set of square $n \times n$ matrices, then we can define the trace function $\operatorname{tr}: M_n \rightarrow \mathbb{R}$ given by $\operatorname{tr}(A) = \sum_{i=1}^n a_{ii}$. We can easily check that it's linear:

Proof that $\operatorname{tr}$ is linear

$$\operatorname{tr}(A+B) = \sum_{i=1}^n (a_{ii} + b_{ii}) = \sum_{i=1}^n a_{ii} + \sum_{i=1}^n b_{ii} = \operatorname{tr}(A) + \operatorname{tr}(B)$$

$$\operatorname{tr}(\lambda A) = \sum_{i=1}^n \lambda a_{ii} = \lambda \sum_{i=1}^n a_{ii} = \lambda \operatorname{tr}(A)$$

The transpose defines a transformation from the set of $m \times n$ matrices to the set of $n \times m$ matrices. It's easy to see that it's linear thanks to its properties, since $(A+B)^t = A^t + B^t$ and $(\lambda A)^t = \lambda A^t$.
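Both the trace and the transpose facts are easy to confirm numerically; here is a minimal sketch using NumPy's built-in `np.trace` and `.T` on random matrices (the sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
A, B = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
lam = 1.7

# The trace is linear on square matrices:
print(np.isclose(np.trace(A + B), np.trace(A) + np.trace(B)))  # True
print(np.isclose(np.trace(lam * A), lam * np.trace(A)))        # True

# The transpose is linear from m x n to n x m matrices:
C, D = rng.normal(size=(2, 4)), rng.normal(size=(2, 4))
print(np.allclose((C + D).T, C.T + D.T))                       # True
print(np.allclose((lam * C).T, lam * C.T))                     # True
```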

But let's not rush: not all functions are linear; in fact, most are not. Think about the line $f(x) = x + 1$. It is clear that $f(x+y) = x + y + 1 \neq x + y + 2 = f(x) + f(y)$, so it cannot be linear. Another example is the cosine function: since $\cos(0) = 1$, it's impossible for it to be linear by the test from the previous section.
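The $T(0) = 0$ test translates directly into code. A tiny sketch (plain Python; the helper name `fails_zero_test` is my own):

```python
import math

def fails_zero_test(f):
    # If f(0) != 0, f cannot be linear; if f(0) == 0, the test is inconclusive.
    return f(0) != 0

print(fails_zero_test(lambda x: x + 1))  # True: f(x) = x + 1 is not linear
print(fails_zero_test(math.cos))         # True: cos(0) = 1, so cosine is not linear
print(fails_zero_test(lambda x: x**2))   # False: inconclusive, yet x^2 still isn't linear
```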

Relationship with the basis

A vector space is completely determined by a basis $\{v_1, v_2, \dots, v_n\}$. This means that any vector $u$ can be uniquely written as a linear combination of the basis vectors: $u = \lambda_1 v_1 + \lambda_2 v_2 + \dots + \lambda_n v_n$. But wait a second: linear transformations get along perfectly with linear combinations. So let's see how a transformation interacts with the basis:

$$\begin{align*} T(u) &= T(\lambda_1 v_1 + \lambda_2 v_2 + \dots + \lambda_n v_n) \\ &= \lambda_1 T(v_1) + \lambda_2 T(v_2) + \dots + \lambda_n T(v_n) \end{align*}$$

Bingo: to know the value of $T$ at $u$, it is enough to know the values of $T$ on the basis! So if we had another vector $z = \mu_1 v_1 + \mu_2 v_2 + \dots + \mu_n v_n$, we could reuse this information: $T(z) = \mu_1 T(v_1) + \mu_2 T(v_2) + \dots + \mu_n T(v_n)$; only the coefficients are different.

In summary, having computed the values of $T$ on just a few vectors, we have virtually computed all of its values. And this also means that if we want to build a linear transformation, we only have to choose its values on the basis.

A linear transformation interacting with a basis

$T$ is completely determined by the values it takes on the basis. Just as the basis describes the whole space, it also describes any linear transformation on it.

This is very useful: if you have a linear transformation that is complicated to evaluate, you only have to calculate its values on the basis and then derive a general formula for every vector.

Let's see a simple example. We'll reconstruct a linear transformation $T: \mathbb{R}^2 \rightarrow \mathbb{R}^2$. The only thing we know is that $T(e_1) = (0, 2)$ and $T(e_2) = (3, 1)$. Then we obtain that, in general:

$$T(x, y) = T(x e_1 + y e_2) = x T(e_1) + y T(e_2) = x(0, 2) + y(3, 1) = (3y, 2x + y)$$

The transformation converts the rectangular grid into a curvilinear one
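If you'd like to verify the reconstruction in code, here is a minimal sketch (Python with NumPy; the names `T_e1` and `T_e2` are my own):

```python
import numpy as np

T_e1 = np.array([0.0, 2.0])  # given: T(e1) = (0, 2)
T_e2 = np.array([3.0, 1.0])  # given: T(e2) = (3, 1)

def T(x, y):
    # T(x, y) = x * T(e1) + y * T(e2), by linearity
    return x * T_e1 + y * T_e2

print(T(1.0, 1.0))   # [3. 3.]  -- matches (3y, 2x + y) at (1, 1)
print(T(2.0, -1.0))  # [-3. 3.] -- matches (3*(-1), 2*2 + (-1))
```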

Thanks to the flexibility of linear transformations, we can sometimes easily construct their inverses. In the previous example, it is enough to denote $w_1 = (0, 2)$ and $w_2 = (3, 1)$ and construct the transformation $S: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ given by $S(w_1) = e_1$ and $S(w_2) = e_2$. Thus, $S$ "returns" each vector to its place of origin and transforms the curvilinear grid back into the original rectangular one.
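One way to realize $S$ concretely is with the matrix whose columns are $w_1$ and $w_2$: its inverse sends each $w_i$ back to $e_i$. A sketch using `np.linalg.inv` (this works here because $w_1$ and $w_2$ are linearly independent):

```python
import numpy as np

W = np.array([[0.0, 3.0],
              [2.0, 1.0]])       # columns are w1 = (0, 2) and w2 = (3, 1)

S = np.linalg.inv(W)             # S satisfies S @ w1 = e1 and S @ w2 = e2

print(S @ np.array([0.0, 2.0]))  # [1. 0.] = e1
print(S @ np.array([3.0, 1.0]))  # [0. 1.] = e2
```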

Conclusion

Let $\{v_1, v_2, \dots, v_n\}$ be a basis for $V$.

  • Linear transformations are functions that preserve the structure of vector spaces.
  • A function $T: V \rightarrow W$ is a linear transformation if for all $v, u \in V$ and for every $\lambda \in \mathbb{R}$:

    1. $T(v+u) = T(v) + T(u)$

    2. $T(\lambda v) = \lambda T(v)$

  • When $T: V \rightarrow V$ is linear, it's called a linear operator.

  • If $T$ is linear, then $T(0) = 0$.

  • Every $m \times n$ matrix $A$ generates a linear transformation defined by $L_A: \mathbb{R}^n \rightarrow \mathbb{R}^m$, $L_A(v) = Av$.

  • The transformation is determined by its values on the basis. For every vector $u \in V$:

$$T(u) = \lambda_1 T(v_1) + \lambda_2 T(v_2) + \dots + \lambda_n T(v_n)$$
