Positive definite matrices are a special family of symmetric matrices that behave like positive numbers. In this topic, you will learn about their geometry and a way to easily recognize them.
They're the last piece for building the singular value decomposition. You'll learn about their most useful properties and even define the square root of a matrix!
Definition
Let's start with the definition. Consider a symmetric matrix $A$ of size $n \times n$. It is called:
- positive definite (PD) if all of its eigenvalues are strictly positive
- positive semidefinite (PSD) if all of its eigenvalues are non-negative
For instance, take the following matrices:
The eigenvalues of the first one are both strictly positive, so it's PD, while those of the second one are both non-negative, hence it's PSD.
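To make the definition concrete, here is a minimal NumPy sketch that classifies a symmetric matrix by the signs of its eigenvalues. The function name `classify`, the tolerance `tol`, and the sample matrices are illustrative assumptions; they are not the matrices discussed above.

```python
import numpy as np

def classify(A, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues.

    `tol` is an assumed tolerance that absorbs floating-point round-off.
    Returns the most specific label: "PD", "PSD" or "neither".
    """
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        raise ValueError("matrix must be symmetric")
    # eigvalsh is meant for symmetric matrices; eigenvalues come back ascending.
    eigenvalues = np.linalg.eigvalsh(A)
    smallest = eigenvalues[0]
    if smallest > tol:
        return "PD"
    if smallest >= -tol:
        return "PSD"
    return "neither"

# Hypothetical examples chosen for this sketch.
print(classify([[2, 1], [1, 2]]))  # eigenvalues 1 and 3
print(classify([[1, 1], [1, 1]]))  # eigenvalues 0 and 2
print(classify([[1, 2], [2, 1]]))  # eigenvalues -1 and 3
```

Every PD matrix is also PSD, so the function reports the stronger property when both hold.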
Although the set of symmetric matrices seems very small, it contains another family of even more special matrices with new special properties. But if it's such a small set, why bother studying them? After all, symmetric matrices are not even common in practice.
No worries. At the end of the topic, you'll discover that you can use any matrix (not necessarily square) to build a PSD one. Furthermore, you can use its properties to analyze the original matrix.
Since PD (PSD) matrices have positive (non-negative) eigenvalues, a good analogy is that they play a role similar to that of positive (non-negative) numbers. This analogy will appear naturally in the properties of these matrices. For now, note that since they're symmetric, they have a spectral decomposition:
$$A = Q D Q^T$$
Here $Q$ is orthogonal and $D$ is diagonal. But now the decomposition is even better since the diagonal entries of $D$ (the eigenvalues of $A$) are positive (non-negative). Let's take a closer look at the geometry of these new matrices. In the following, let $A$ denote a PD (PSD) matrix.
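A spectral decomposition of this kind can be checked numerically. The sketch below assumes NumPy and uses a hypothetical PD matrix; the variable names `Q` and `D` simply mirror the orthogonal and diagonal factors.

```python
import numpy as np

# Hypothetical PD matrix used only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is designed for symmetric matrices: it returns the eigenvalues in
# ascending order and a matrix whose columns are orthonormal eigenvectors.
eigenvalues, Q = np.linalg.eigh(A)
D = np.diag(eigenvalues)

print(np.allclose(Q @ D @ Q.T, A))      # the factors reproduce A
print(np.allclose(Q.T @ Q, np.eye(2)))  # Q is orthogonal
print(bool(np.all(eigenvalues > 0)))    # positive diagonal, so A is PD
```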
Geometry
A good way to geometrically measure the force with which $A$ deforms space is to calculate the dot product between a vector $v$ and its image under $A$. Therefore, define the quadratic form associated with $A$ as the function $q_A: \mathbb{R}^n \to \mathbb{R}$ given by:
$$q_A(v) = \langle v, Av \rangle = v^T A v$$
Using the first matrix from the previous section, you can easily compute its quadratic form as:
A quadratic form $q_A$ is called positive definite (positive semidefinite) when $q_A(v) > 0$ ($q_A(v) \geq 0$) for every $v \neq 0$. This name is no coincidence since a symmetric matrix is PD (PSD) if and only if its quadratic form is PD (PSD). Actually, when $A$ is PD, the function $\langle u, v \rangle_A = u^T A v$ defines an inner product in $\mathbb{R}^n$.
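As a generic illustration (the entries $a$, $b$, $c$ and the coordinates $x$, $y$ are notation introduced here, not values from the text), the quadratic form of a symmetric $2 \times 2$ matrix expands as:

$$\begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = a x^2 + 2 b x y + c y^2$$

The diagonal entries give the pure squares, while the off-diagonal entry $b$ produces the mixed term $2bxy$.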
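Now, notice that $v^T A v = \langle v, Av \rangle = \|v\| \, \|Av\| \cos\theta$, where $\theta$ is the angle between $v$ and $Av$. So, when $A$ is PD (PSD), then $v^T A v > 0$ ($v^T A v \geq 0$) implies that $\cos\theta > 0$ ($\cos\theta \geq 0$). This in turn means that $\theta < \pi/2$ ($\theta \leq \pi/2$), hence the angle is acute (or at most a right angle). Thus you can think of PD (PSD) matrices as symmetric matrices that send each vector to a nearby place, in the sense that the angle between a vector and its image does not exceed $\pi/2$.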
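The angle argument can be tried out on a few vectors. This is a small plain-Python sketch; the helper `angle_with_image`, the matrix, and the test vectors are hypothetical choices, and the helper assumes that neither the vector nor its image is zero.

```python
import math

def angle_with_image(A, v):
    """Angle in radians between v and A v, with A given as nested lists.

    Assumes v and A v are both non-zero.
    """
    Av = [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]
    dot = sum(v_i * w_i for v_i, w_i in zip(v, Av))
    norm_v = math.sqrt(sum(v_i * v_i for v_i in v))
    norm_Av = math.sqrt(sum(w_i * w_i for w_i in Av))
    # Clamp to [-1, 1] to guard against round-off before calling acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_v * norm_Av)))
    return math.acos(cos_theta)

A = [[2, 1], [1, 2]]  # hypothetical PD matrix
vectors = [[1, 0], [0, 1], [1, -1], [3, -2]]
for v in vectors:
    theta = angle_with_image(A, v)
    print(v, round(math.degrees(theta), 1), theta < math.pi / 2)
```

For a PD matrix the last column should read `True` for every non-zero vector you try.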
Ellipses and the spectral decomposition
But let's go a step further by illustrating the power of our theoretical tools. Go back to the quadratic form you just calculated and consider all vectors $v$ on which it takes a fixed positive value, that is:
If you remember anything about conics, you might recognize that this equation defines an ellipse.
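However, the problem is that the "cross" term, the one that multiplies the two coordinates together, makes its geometric interpretation very difficult. At this point, you're ready to use spectral decomposition to remove that cross term. Construct a spectral decomposition for $A$ with the pieces $Q$ and $D$. As $A = Q D Q^T$, you can reconstruct the quadratic form just as:
$$q_A(v) = v^T Q D Q^T v = (Q^T v)^T D \, (Q^T v)$$
Writing $w = Q^T v$ for the coordinates of $v$ in the eigenvector basis and $\lambda_1, \lambda_2$ for the diagonal entries of $D$, it's easy to see that:
$$q_A(v) = \lambda_1 w_1^2 + \lambda_2 w_2^2$$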
Great, no more cross terms. In fact, this trick always works: every PD matrix represents an ellipse (or its higher-dimensional analogue) with cross terms that can be removed thanks to the spectral decomposition. Also, take a look at the columns of $Q$: they point in the directions of the principal axes of the ellipse.
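The removal of the cross term can be verified numerically. The following NumPy sketch uses a hypothetical PD matrix and, as an assumption made for this example, the level set where the quadratic form equals 1.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # hypothetical PD matrix
eigenvalues, Q = np.linalg.eigh(A)  # A = Q @ diag(eigenvalues) @ Q.T

rng = np.random.default_rng(0)
v = rng.normal(size=2)              # arbitrary test vector
w = Q.T @ v                         # coordinates of v along the principal axes

q_original = v @ A @ v                   # expression that contains the cross term
q_diagonal = np.sum(eigenvalues * w**2)  # weighted sum of squares only
print(np.isclose(q_original, q_diagonal))

# On the ellipse where the quadratic form equals 1, the semi-axis along the
# i-th column of Q has length 1 / sqrt(eigenvalue_i).
semi_axes = 1.0 / np.sqrt(eigenvalues)
print(semi_axes)
```

Larger eigenvalues give shorter semi-axes, since the form grows faster in those directions.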
How to detect positive definiteness
You can tell if a matrix is symmetric or diagonal with just a glance, but to know if it's PD (PSD), you have to calculate all its eigenvalues, which isn't very straightforward.
The most popular method to facilitate this task is called Sylvester's criterion. It consists of calculating $n$ determinants in total. Specifically, for each number $k$ between $1$ and $n$, you have to compute the $k$-th top-left minor of $A$, that is, the determinant of its top-left $k \times k$ block. The matrix is PD exactly when all of these minors are positive. For example, the top-left minors of a small matrix look like this:
You already know that the first matrix from the Definition section is PD. Both of its top-left minors are positive, so the test is consistent. Now take a bigger matrix:
Its top-left minors are all positive:
so, the matrix is PD.
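Sylvester's criterion is short enough to sketch in plain Python. The helper names `determinant`, `leading_minors` and `is_positive_definite` and the sample matrix are hypothetical; exact fractions are used here only so that the signs of the minors are not affected by round-off.

```python
from fractions import Fraction

def determinant(M):
    """Exact determinant via Gaussian elimination on Fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det  # a row swap flips the sign
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return det

def leading_minors(A):
    """Determinants of the top-left k x k blocks for k = 1..n."""
    n = len(A)
    return [determinant([row[:k] for row in A[:k]]) for k in range(1, n + 1)]

def is_positive_definite(A):
    """Sylvester's criterion; A is assumed to be symmetric."""
    return all(minor > 0 for minor in leading_minors(A))

A = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]  # hypothetical symmetric matrix
print(leading_minors(A))
print(is_positive_definite(A))
```

The test is stated for PD matrices only. Non-negative top-left minors alone do not guarantee that a matrix is PSD: the diagonal matrix with entries $0$ and $-1$ has both top-left minors equal to zero but a negative eigenvalue.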
Leveraging PD matrices
Any real number $x$ satisfies $x^2 \geq 0$. Something very similar happens with matrices, and it is the reason why you're studying PD (PSD) matrices: for any invertible (square) matrix $B$, the matrix $B^T B$ is PD (PSD).
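The reason behind this fact can be sketched in two lines using the quadratic form from the Geometry section. Here $B$ stands for the matrix in the statement above and $v$ for an arbitrary vector; both letters are notation chosen for this sketch.

$$(B^T B)^T = B^T (B^T)^T = B^T B$$

$$v^T (B^T B) \, v = (Bv)^T (Bv) = \|Bv\|^2 \geq 0$$

So $B^T B$ is symmetric and its quadratic form is never negative, which makes it PSD. When $B$ is invertible, $Bv \neq 0$ for every $v \neq 0$, so the inequality is strict and $B^T B$ is PD.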
What is more surprising is that the converse is also true. If a matrix $A$ is PD (PSD), then there exists an invertible (square) matrix $B$ such that $A = B^T B$. In that sense, it's as if $B$ were the square root of $A$, but there's a problem: it's possible to find several matrices $B$ with the property that $A = B^T B$. Despite this fact, all of these square-root candidates are connected to each other. This result is known as the orthogonal freedom, and it states that if $B^T B = C^T C$, then there exists an orthogonal matrix $Q$ such that $B = QC$. This result may seem boring to you, but it is the key to the construction of the singular value decomposition.
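Both claims can be illustrated numerically. The sketch below assumes NumPy and a hypothetical PD matrix. It obtains one square-root candidate from the Cholesky factorization, which is a convenient technique swapped in for this example and not necessarily the construction the text has in mind, and then produces a second candidate with a rotation.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # hypothetical PD matrix

# NumPy's Cholesky routine returns a lower-triangular L with A = L @ L.T,
# so B = L.T is one matrix satisfying A = B.T @ B.
L = np.linalg.cholesky(A)
B = L.T
print(np.allclose(B.T @ B, A))

# Any orthogonal Q gives another candidate C = Q @ B, because
# C.T @ C = B.T @ Q.T @ Q @ B = B.T @ B.
theta = 0.7  # arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
C = Q @ B
print(np.allclose(C.T @ C, A))
print(np.allclose(B, C))  # compares the two candidates entry by entry
```

The first two checks confirm that both matrices reproduce `A`, while the last comparison shows that they are genuinely different matrices.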
A positive number has two square roots, one positive and one negative, but only the positive one is still a positive number. Something very similar occurs with any PSD matrix $A$, since there's always a unique PSD matrix $R$ with the property that $R^2 = A$. It's known as the principal root of $A$ and is denoted as $\sqrt{A}$. You can calculate it easily thanks to the spectral decomposition, because if $A = Q D Q^T$, then by defining $\sqrt{D}$ as the diagonal matrix whose entries are the square roots of those of $D$, you get that:
$$\sqrt{A} = Q \sqrt{D} \, Q^T$$
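The recipe for the principal root translates almost literally into NumPy. In this minimal sketch the function name `principal_sqrt` and the sample matrix are hypothetical, and clipping tiny negative eigenvalues to zero is an assumption made to cope with round-off.

```python
import numpy as np

def principal_sqrt(A):
    """Principal square root of a symmetric PSD matrix via its spectral decomposition."""
    eigenvalues, Q = np.linalg.eigh(A)
    # Round-off can make a zero eigenvalue slightly negative; clip it to zero.
    eigenvalues = np.clip(eigenvalues, 0.0, None)
    return Q @ np.diag(np.sqrt(eigenvalues)) @ Q.T

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # hypothetical PD matrix
R = principal_sqrt(A)

print(np.allclose(R @ R, A))                     # R squares back to A
print(np.allclose(R, R.T))                       # R is symmetric
print(bool(np.all(np.linalg.eigvalsh(R) >= 0)))  # R is PSD
```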
Finally, just as you define the absolute value of any number $x$ as $|x| = \sqrt{x^2}$, the same goes for any square matrix. So, if $B$ is a square matrix, then $B^T B$ is PSD and therefore has a square root. Thus, you define $|B| = \sqrt{B^T B}$.
Actually, this matrix is intimately connected with $B$ since they share several properties, but it has the advantage of being PSD. The definitive connection between them is known as the polar decomposition. It establishes that for any square matrix $B$ there exists an orthogonal one $Q$ such that:
$$B = Q \, |B|$$
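For an invertible matrix, the orthogonal factor can be recovered directly from the absolute value. The sketch below assumes NumPy and restricts itself to the invertible case, which is narrower than the general statement; the helper names `principal_sqrt` and `polar` and the sample matrix are hypothetical.

```python
import numpy as np

def principal_sqrt(A):
    """Principal square root of a symmetric PSD matrix."""
    eigenvalues, Q = np.linalg.eigh(A)
    eigenvalues = np.clip(eigenvalues, 0.0, None)  # guard against round-off
    return Q @ np.diag(np.sqrt(eigenvalues)) @ Q.T

def polar(B):
    """Return (Q, P) with B = Q @ P, Q orthogonal and P the absolute value of B.

    Assumes B is invertible, so that P is invertible as well.
    """
    P = principal_sqrt(B.T @ B)
    Q = B @ np.linalg.inv(P)
    return Q, P

B = np.array([[1.0, 2.0],
              [3.0, 4.0]])  # hypothetical invertible matrix
Q, P = polar(B)

print(np.allclose(Q @ P, B))                     # the factors reproduce B
print(np.allclose(Q.T @ Q, np.eye(2)))           # Q is orthogonal
print(bool(np.all(np.linalg.eigvalsh(P) >= 0)))  # the absolute value is PSD
```

The orthogonality of `Q` follows from `P @ P` being equal to `B.T @ B`, which is what the second check confirms numerically.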
Conclusion
Let's review everything you've learned so far. Consider a symmetric square matrix $A$.
- $A$ is positive definite PD (positive semidefinite PSD) if all of its eigenvalues are positive (non-negative).
- The quadratic form associated with $A$ is the function $q_A$ given by $q_A(v) = v^T A v$.
- $A$ is PD (PSD) if and only if its quadratic form is PD (PSD).
- $A$ is PD if and only if every top-left minor is positive.
- If $A$ is PSD, then its principal root $\sqrt{A}$ is the unique PSD matrix such that $(\sqrt{A})^2 = A$.
- For every square matrix $B$, the new matrix $B^T B$ is PSD. Then, the absolute value of $B$ is $|B| = \sqrt{B^T B}$.