Systems of linear equations gave rise to linear algebra. But it has been a while since you began to learn this branch of mathematics. You have learned a lot of sophisticated techniques and concepts. At this point, solving a system of equations must seem easy. However, your new favorite tool, singular value decomposition, has much to say about it. Actually, thanks to it, you'll look at solving systems of linear equations in a slightly different way.
In this topic, you will learn how to calculate the solution of any system of equations. But wait, not all systems have solutions, and some even have infinitely many. So, how can you talk about "the solution" of any system? Get ready to get a little dizzy and develop an infallible tool!
In the following, you'll be working with an $m \times n$ matrix $A$ of rank $r$. Also, consider a vector $b$ in $\mathbb{R}^m$.
Definition
As you already know, every matrix has an SVD. This decomposition allows you to understand the properties of the matrix deeply. Remarkably, the most useful form of SVD here is the compact one:

$$A = U_r \Sigma_r V_r^T,$$

where $\Sigma_r$ is the $r \times r$ diagonal matrix of the non-zero singular values $\sigma_1 \ge \cdots \ge \sigma_r > 0$, and the columns of $U_r$ and $V_r$ are the first $r$ left and right singular vectors, respectively.
Let's investigate the relationship between the decomposition and the inverse of the matrix. Remember that if $A$ is square of size $n$ and all of its singular values are positive, then it's invertible, and its inverse is given by:

$$A^{-1} = V \Sigma^{-1} U^T.$$
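If you want to see this identity numerically, here's a minimal numpy sketch; the matrix below is an arbitrary invertible choice, not one taken from this topic:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])               # arbitrary invertible matrix

U, s, Vt = np.linalg.svd(A)              # A = U @ np.diag(s) @ Vt
A_inv = Vt.T @ np.diag(1.0 / s) @ U.T    # A^{-1} = V Sigma^{-1} U^T

print(np.allclose(A_inv, np.linalg.inv(A)))   # True
```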
For a general matrix, only the first $r$ singular values are non-zero. So, you draw inspiration from the last expression to define the closest thing to the inverse of $A$, namely, its pseudoinverse:

$$A^+ = V_r \Sigma_r^{-1} U_r^T.$$
Note that the size of $A^+$ is $n \times m$. With the original form of SVD, if $A = U \Sigma V^T$, then:

$$A^+ = V \Sigma^+ U^T,$$

where $\Sigma^+$ is the $n \times m$ matrix obtained from the $m \times n$ matrix $\Sigma$ by transposing its shape and replacing each non-zero singular value with its reciprocal.
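Here's a sketch of this construction in numpy; the $3 \times 2$ matrix is an illustrative choice, and numpy's own `np.linalg.pinv` serves as a cross-check:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])        # illustrative 3 x 2 matrix
m, n = A.shape

U, s, Vt = np.linalg.svd(A)       # full SVD: U is m x m, Vt is n x n

# Sigma^+ has the transposed shape (n x m), with every non-zero
# singular value replaced by its reciprocal (up to round-off).
tol = max(m, n) * np.finfo(float).eps * s.max()
s_plus = np.where(s > tol, 1.0 / s, 0.0)
Sigma_plus = np.zeros((n, m))
Sigma_plus[:len(s), :len(s)] = np.diag(s_plus)

A_plus = Vt.T @ Sigma_plus @ U.T  # A^+ = V Sigma^+ U^T
print(np.allclose(A_plus, np.linalg.pinv(A)))   # True
```

In practice you'd call `np.linalg.pinv` directly; building $\Sigma^+$ by hand is only worth it to see the definition at work.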
This means that, although not every matrix is invertible, thanks to SVD, every matrix (even rectangular ones) has a pseudoinverse! As a first application, note that when the matrix is square of size $n$ and invertible, its pseudoinverse coincides with its inverse. This is because, in such a case, $\Sigma^+ = \Sigma^{-1}$, which implies that:

$$A^+ = V \Sigma^+ U^T = V \Sigma^{-1} U^T = A^{-1}.$$
In summary, you've just proved that:

$$A^+ = A^{-1} \quad \text{whenever } A \text{ is invertible.}$$
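A one-line numerical sanity check of this fact, again with an arbitrary invertible matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # invertible, so pinv and inv must agree
print(np.allclose(np.linalg.pinv(A), np.linalg.inv(A)))   # True
```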
Before delving into the properties of the pseudoinverse, let's see how you can easily calculate it.
Calculation
You already have plenty of practice calculating the SVD and are familiar with the advantages it brings. As you just saw, you can use this decomposition to construct the pseudoinverse easily. Let's see an example. Take the matrix:

$$A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.$$

It's clear that it isn't invertible, but as it always has an SVD, it has a pseudoinverse. First of all, an SVD for $A$ is given by:

$$A = U \Sigma V^T = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix} \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$

This means that:

$$\Sigma^+ = \begin{pmatrix} 1/2 & 0 \\ 0 & 0 \end{pmatrix}.$$

Thus:

$$A^+ = V \Sigma^+ U^T = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1/2 & 0 \\ 0 & 0 \end{pmatrix} \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 1/4 & 1/4 \\ 1/4 & 1/4 \end{pmatrix}.$$
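As a quick check, numpy's built-in pseudoinverse reproduces this result (a minimal sketch; `np.linalg.pinv` computes $A^+$ via the SVD internally):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])   # the rank-1 matrix from the example

print(np.linalg.pinv(A))
# [[0.25 0.25]
#  [0.25 0.25]]
```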
Up to this point, you know that every matrix has a pseudoinverse, that it is easy to calculate once you have the SVD, and that it reduces to the inverse when the original matrix is invertible. As you're about to see, the pseudoinverse also singles out one well-defined candidate solution for any system. You will now explore important applications of this new tool: systems of equations that don't have a unique solution.
The best solution among infinitely many
If you try to solve the system given by:

$$\begin{cases} x + y = 2 \\ y + z = 2 \end{cases} \qquad \text{that is,} \qquad \underbrace{\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}}_{A} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \underbrace{\begin{pmatrix} 2 \\ 2 \end{pmatrix}}_{b},$$

you should get that the solutions are:

$$(x, y, z) = (t,\; 2 - t,\; t), \qquad t \in \mathbb{R}.$$

At first glance, it is not obvious which of all the possible solutions is the simplest. For instance, $t = \sqrt{2}$ is perfectly valid, but the associated solution would look like $(\sqrt{2},\; 2 - \sqrt{2},\; \sqrt{2})$, quite ugly. The pseudoinverse can help: if you calculate

$$A^+ = \frac{1}{3}\begin{pmatrix} 2 & -1 \\ 1 & 1 \\ -1 & 2 \end{pmatrix},$$

then $x^+ = A^+ b = \left( \tfrac{2}{3},\; \tfrac{4}{3},\; \tfrac{2}{3} \right)$ is a solution of the system since $A x^+ = b$.
But this solution has a curious quality: it is the smallest one, in the sense that its norm is smaller than that of any other solution. This means that it is the solution closest to the origin.
Best of all, this is generally true: whenever the system $Ax = b$ has infinitely many solutions, $x^+ = A^+ b$ is the one with the smallest norm.
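You can confirm this numerically with the system above; a minimal sketch, where the alternative solution (taking $t = 0$) is an arbitrary choice:

```python
import numpy as np

# The underdetermined system from above: x + y = 2, y + z = 2.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 2.0])

x_plus = np.linalg.pinv(A) @ b
print(x_plus)                    # approx. [0.667 1.333 0.667]
print(np.linalg.norm(x_plus))    # approx. 1.633, the smallest possible

# Any other valid solution, e.g. the one with t = 0, is strictly longer:
other = np.array([0.0, 2.0, 0.0])
print(A @ other, np.linalg.norm(other))   # [2. 2.] 2.0
```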
The closest thing to a solution
Let's face it: very few rectangular systems of equations have solutions. This is especially true when there are more equations than variables: there are too many conditions for just a few variables to meet.
Recall that the linear system $Ax = b$ can be rewritten as $x_1 a_1 + x_2 a_2 + \cdots + x_n a_n = b$, where $a_1, \dots, a_n$ are the columns of $A$. This means that the system has a solution only when $b$ belongs to the subspace generated by the columns of $A$, denoted by $\operatorname{Col}(A)$. When there is no solution, you could try to approximate $b$ as much as possible with some element of $\operatorname{Col}(A)$. That is, you look for the vector $\hat{b} \in \operatorname{Col}(A)$ such that $\|b - \hat{b}\| \le \|b - w\|$ for any other $w \in \operatorname{Col}(A)$.
Geometrically, the vector $b - \hat{b}$ is orthogonal to $\operatorname{Col}(A)$, so you should think of $\hat{b}$ as the perpendicular projection of $b$ onto $\operatorname{Col}(A)$. It might seem tricky to find it, but the pseudoinverse comes to the rescue:

$$\hat{b} = A A^+ b.$$
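Here's a tiny numpy sketch of this projection idea; the matrix and vector are deliberately simple choices made for illustration:

```python
import numpy as np

A = np.array([[1.0],
              [1.0]])            # Col(A) is the line spanned by (1, 1)
b = np.array([0.0, 2.0])         # b lies outside of Col(A)

b_hat = A @ np.linalg.pinv(A) @ b   # perpendicular projection onto Col(A)
print(b_hat)                        # [1. 1.]

# The residual b - b_hat is orthogonal to every column of A:
print(A.T @ (b - b_hat))            # [0.]
```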
This optimization technique is known as least squares, and you will explore it in the next topic. For now, let's look at a simple example. Suppose you want to solve the system given by:

$$\begin{cases} x = 1 \\ y = 1 \\ x + y = 1 \end{cases} \qquad \text{that is,} \qquad \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{pmatrix}}_{A} \begin{pmatrix} x \\ y \end{pmatrix} = \underbrace{\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}}_{b}.$$

This system has no solution, so the best strategy is to approximate $b$ with its projection onto $\operatorname{Col}(A)$. For this, verify that:

$$A^+ = \frac{1}{3}\begin{pmatrix} 2 & -1 & 1 \\ -1 & 2 & 1 \end{pmatrix}, \qquad x^+ = A^+ b = \begin{pmatrix} 2/3 \\ 2/3 \end{pmatrix}.$$

Therefore, the best approximation to $b$ among all vectors of $\operatorname{Col}(A)$ is given by $\hat{b} = A x^+ = \left( \tfrac{2}{3},\; \tfrac{2}{3},\; \tfrac{4}{3} \right)$, and $x^+$ would be your best estimation of a solution.
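As a final check, numpy's dedicated least-squares routine agrees with the pseudoinverse solution for this system (a sketch; `np.linalg.lstsq` is the standard solver for this job):

```python
import numpy as np

# The inconsistent system from the example.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 1.0])

x_plus = np.linalg.pinv(A) @ b
print(x_plus)              # approx. [0.667 0.667]
print(A @ x_plus)          # approx. [0.667 0.667 1.333], i.e. b_hat

# numpy's least-squares solver finds the same best estimate:
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_plus, x_ls))   # True
```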
Conclusion
- Every matrix $A$ has a pseudoinverse given by $A^+ = V \Sigma^+ U^T$.
- When the matrix $A$ is invertible, $A^+ = A^{-1}$.
- If the linear system $Ax = b$ has infinitely many solutions, then $x^+ = A^+ b$ is the shortest one.
- If the linear system $Ax = b$ has no solutions, then $x^+ = A^+ b$ gives you the closest thing to a solution in the sense that for any other vector $x$, $Ax^+$ is closer to $b$ than $Ax$ is.