
Calculating matrix inverse


Although calculating inverses is usually complicated, in this topic you'll learn a universal algorithm for finding them, as well as a general expression for the inverse in terms of determinants.

To develop the algorithm, you'll need the elementary row operations that you already handle really well. In fact, the elementary matrices will be the building blocks. So let's not wait any longer!

Elementary matrices

As you know, applying an elementary row operation to a matrix $A$ is exactly the same as applying that operation to the identity (which is easy), forming a new matrix $E$, and then taking the product $EA$. For this reason, $E$ is called elementary.

To find the inverse of a matrix, you are going to multiply a bunch of elementary matrices. For now, the first thing to note about any elementary matrix is that it's always invertible. Moreover, you can easily calculate its inverse.

Take, for example, the elementary matrix resulting from swapping the first two rows of the identity matrix, $E = \begin{bmatrix}0 & 1 \\ 1 & 0 \end{bmatrix}$. If you swap the rows again, you retrieve the identity matrix. But this is equivalent to premultiplying by the elementary matrix that swaps these rows, which in this particular case is precisely $E$. This means that the inverse of $E$ is itself, which you can easily check:

$$E E = \begin{bmatrix}0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix}0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix} = I$$

In general, the inverse of any elementary matrix is the elementary matrix that applies the same elementary operation, but in the "inverse" direction. That is, when the operation is:

  • To multiply a row by a number $c$, the inverse is to multiply the same row by $1/c$.
  • To add a scalar multiple of one row to another, the inverse is to subtract the same scalar multiple of the first row from the second.
  • To exchange two rows, the inverse is to exchange them again, so the matrix is its own inverse.

For example, if you obtained the elementary matrix $E$ from the identity by multiplying a row by $5$, then it is enough to multiply that same row by $1/5$ to recover the identity. So, if $F$ is the elementary matrix of this last operation, then $EF = I$, and by the same reasoning, $FE = I$. Thus $F = E^{-1}$.
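You can check these facts numerically. Below is a minimal Python sketch using exact fractions (the particular $2 \times 2$ matrices are just illustrative), verifying that each elementary operation and its inverse multiply to the identity:

```python
from fractions import Fraction

def matmul(M, N):
    # Multiply two matrices given as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

I = [[1, 0], [0, 1]]

E = [[5, 0], [0, 1]]               # multiply row 0 of the identity by 5
F = [[Fraction(1, 5), 0], [0, 1]]  # the inverse operation: multiply row 0 by 1/5
assert matmul(E, F) == I and matmul(F, E) == I

S = [[0, 1], [1, 0]]               # swap the two rows of the identity
assert matmul(S, S) == I           # a row swap is its own inverse
```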

You're almost done. The last thing you must know is that products and inverses get along well.

If $A$ and $B$ are invertible, then $AB$ is also invertible, and its inverse is:

$$(AB)^{-1} = B^{-1} A^{-1}$$

Proof

You only have to check the definition of the inverse matrix. On one hand:

$$(AB)(B^{-1} A^{-1}) = A (BB^{-1}) A^{-1} = A I A^{-1} = A A^{-1} = I$$

And on the other hand:

$$(B^{-1} A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1} I B = B^{-1} B = I$$

\blacksquare
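Here is a quick numerical check of this identity, a Python sketch using exact fractions. The matrices `A` and `B` and the helper `inv2` (the $2 \times 2$ inverse formula, which this article derives later via cofactors) are just illustrative choices:

```python
from fractions import Fraction

def matmul(M, N):
    # Multiply two matrices given as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def inv2(M):
    # Inverse of a 2x2 matrix: swap the diagonal, negate the rest, divide by det.
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 3], [1, 4]]  # det = 5, invertible
B = [[1, 2], [3, 7]]  # det = 1, invertible
I = [[1, 0], [0, 1]]

# (AB)(B^{-1} A^{-1}) = I, so B^{-1} A^{-1} really is the inverse of AB.
assert matmul(matmul(A, B), matmul(inv2(B), inv2(A))) == I
assert matmul(inv2(matmul(A, B)), matmul(A, B)) == I
```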

How to find the inverse

Let's put all your work together:

Let $A$ be a square matrix of size $n$. Then $A$ is invertible if and only if its reduced row echelon form is the identity, that is, if there exist elementary matrices $E_1, E_2, \dots, E_k$ such that:

$$E_k \cdots E_2 E_1 A = I$$

In this case, the inverse of $A$ is:

$$A^{-1} = E_k \cdots E_2 E_1$$

Proof

If $E_k \cdots E_2 E_1 A = I$, then by multiplying both sides on the left by $(E_k \cdots E_2 E_1)^{-1}$, it's clear that $A = (E_k \cdots E_2 E_1)^{-1}$. Therefore, by multiplying on the right by $E_k \cdots E_2 E_1$, you get $A (E_k \cdots E_2 E_1) = (E_k \cdots E_2 E_1)^{-1} (E_k \cdots E_2 E_1) = I$. Thus, $A$ is invertible, and $A^{-1} = E_k \cdots E_2 E_1$.

For the rest of the proof, you need the following lemma:

Let $B$ be an invertible matrix and $E_1, E_2, \dots, E_k$ be elementary matrices.

If the matrix $T$ defined as $T = E_k \cdots E_2 E_1 B$ is in row-echelon form, then all of the entries of its main diagonal are different from zero.

Proof

Suppose that there is an entry in the main diagonal of $T$ which is zero. That is, there is some $i \in \{1, 2, \dots, n\}$ such that $t_{ii} = 0$.

Recall that $T$ is in row-echelon form. Think about the rows below the $i$-th one: they must have their first non-zero entry strictly to the right of the $i$-th position. But as $T$ is square, there aren't enough columns for all of them, so the last row can't have any non-zero entry. Thus the last row of $T$ is the zero vector.

Now, take any square matrix $C$. What does the last row of $TC$ look like? As the last row of $T$ is zero, the last row of $TC$ is the zero vector. In consequence, $TC$ cannot be the identity matrix $I$, so $C$ isn't the inverse of $T$. As $C$ is an arbitrary square matrix, this holds for every square matrix, so $T$ cannot be invertible. But $T$ is a product of invertible matrices, hence invertible, which is a contradiction.

\blacksquare

Great. Now assume that $A$ is invertible, and first put it into row-echelon form by applying some $l$ elementary row operations. By the lemma, all of the entries in the main diagonal of the matrix $E_l \cdots E_2 E_1 A$ are non-zero. Then, by applying some more elementary row operations, you can convert all of these entries into $1$'s. After that, you can continue applying elementary row operations to turn all of the entries off the diagonal into $0$'s. Hence the reduced row echelon form of $A$ is $I$.

Let's say that the total number of elementary row operations needed was $k$. Then there exist elementary matrices $E_1, E_2, \dots, E_k$ such that $E_k \cdots E_2 E_1 A = I$. Hence, as $A$ is invertible, $A^{-1} = I A^{-1} = E_k \cdots E_2 E_1 A A^{-1} = E_k \cdots E_2 E_1$.

\blacksquare

Although it may not seem so at first sight, this result gives us an infallible algorithm to find the inverse of any invertible matrix:

  1. Find the reduced row echelon form of $A$.
  2. If the result is the identity, you can move forward; otherwise, $A$ is not invertible.
  3. Save all the elementary matrices $E_1, E_2, \dots, E_k$ used in the previous process.
  4. Then $A^{-1}$ is the product of these elementary matrices, in the reverse order in which they were used. That is, $A^{-1} = E_k \cdots E_2 E_1$.
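The steps above can be sketched in code. Here is one illustrative run in Python with exact fractions, recording each elementary matrix explicitly; the $2 \times 2$ matrix and the particular sequence of row operations are just one possible choice:

```python
from fractions import Fraction

def matmul(M, N):
    # Multiply two matrices given as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

A = [[2, 3], [1, 4]]
I = [[1, 0], [0, 1]]

# One sequence of row operations that reduces A to I, each as an elementary matrix:
E1 = [[0, 1], [1, 0]]                # swap the two rows
E2 = [[1, 0], [-2, 1]]               # subtract 2 * row 0 from row 1
E3 = [[1, 0], [0, Fraction(-1, 5)]]  # divide row 1 by -5
E4 = [[1, -4], [0, 1]]               # subtract 4 * row 1 from row 0

# Applying them in order reduces A to the identity...
assert matmul(E4, matmul(E3, matmul(E2, matmul(E1, A)))) == I

# ...so their product, in reverse order of application, is A^{-1}.
A_inv = matmul(E4, matmul(E3, matmul(E2, E1)))
assert matmul(A_inv, A) == I
```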

But wait a minute. The product $E_k \cdots E_2 E_1$ is the same as $(E_k \cdots E_2 E_1) I$. This means that the operations applied to $A$ to transform it into $I$ then have to be applied again to $I$, in the same order, to convert it into $A^{-1}$. So, why not apply the operations to both matrices at the same time?

First, join the two matrices into an augmented one by putting $I$ to the right of $A$:

$$[A \mid I]$$

After that, start applying elementary row operations to the big matrix in order to row reduce $A$ – this means that every operation applied to $A$ is immediately applied to $I$. At the end, when on the left $A$ has become $I$, on the right $I$ will have become $A^{-1}$:

$$[I \mid A^{-1}]$$

For example, the inverse of the matrix $A = \begin{bmatrix}2 & 3 \\ 1 & 4 \end{bmatrix}$ is $A^{-1} = \begin{bmatrix}4/5 & -3/5 \\ -1/5 & 2/5 \end{bmatrix}$. The method establishes that you first build the augmented matrix $[A \mid I]$ and apply operations until $A$ is converted into $I$; at that moment, the original $I$ will have become $A^{-1}$. You can visualize it in the following way:

The process of finding the inverse

But it's better to see the method in action!

An invertible matrix

Suppose you want to find the inverse of the following matrix:

$$A = \begin{bmatrix}2 & 3 \\ 1 & 4 \end{bmatrix}$$

The first step is to define the bigger matrix:

$$[A \mid I] = \left[\begin{array}{ll|ll} 2 & 3 & 1 & 0 \\ 1 & 4 & 0 & 1 \end{array}\right]$$

Now compute the reduced row echelon form of $A$. You can start by interchanging the two rows:

$$\left[\begin{array}{ll|ll} 1 & 4 & 0 & 1 \\ 2 & 3 & 1 & 0 \end{array}\right]$$

Now, subtract two times the first row from the second one:

$$\left[\begin{array}{ll|ll} 1 & 4 & 0 & 1 \\ 0 & -5 & 1 & -2 \end{array}\right]$$

Now divide the second row by $-5$:

$$\left[\begin{array}{ll|ll} 1 & 4 & 0 & 1 \\ 0 & 1 & -1/5 & 2/5 \end{array}\right]$$

Finally, subtract $4$ times the second row from the first one:

$$\left[\begin{array}{ll|ll} 1 & 0 & 4/5 & -3/5 \\ 0 & 1 & -1/5 & 2/5 \end{array}\right]$$

You've just transformed $A$ into $I$. So, the desired inverse is:

$$A^{-1} = \frac{1}{5} \begin{bmatrix}4 & -3 \\ -1 & 2 \end{bmatrix}$$
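The augmented-matrix procedure translates almost directly into code. Below is a minimal Gauss-Jordan sketch in Python with exact fractions; `gauss_jordan_inverse` is just an illustrative name, not a standard library function:

```python
from fractions import Fraction

def gauss_jordan_inverse(A):
    """Invert A by row-reducing the augmented matrix [A | I]."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact fractions.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for i in range(n):
        # Find a row with a non-zero pivot in column i and swap it up.
        p = next((r for r in range(i, n) if M[r][i] != 0), None)
        if p is None:
            raise ValueError("matrix is not invertible")
        M[i], M[p] = M[p], M[i]
        pivot = M[i][i]
        M[i] = [x / pivot for x in M[i]]  # scale the pivot row to get a leading 1
        for r in range(n):                # clear the rest of the pivot column
            if r != i and M[r][i] != 0:
                factor = M[r][i]
                M[r] = [x - factor * y for x, y in zip(M[r], M[i])]
    return [row[n:] for row in M]         # the right half is now A^{-1}

A = [[2, 3], [1, 4]]
assert gauss_jordan_inverse(A) == [[Fraction(4, 5), Fraction(-3, 5)],
                                   [Fraction(-1, 5), Fraction(2, 5)]]
```

The row swap, scaling, and row subtraction inside the loop are exactly the three kinds of elementary operations described earlier, applied to the whole augmented matrix at once.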

The inverse in terms of determinants

The determinant is a really versatile tool, and you can even use it to build the inverse of a matrix $A$. The first step is to build a new matrix with all the possible cofactors of $A$:

$$\operatorname{cof}(A) = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1 n} \\ c_{21} & c_{22} & \cdots & c_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n 1} & c_{n 2} & \cdots & c_{n n} \end{bmatrix} \in \mathbb{R}^{n\times n}$$

Here $c_{ij} = (-1)^{i+j} m_{ij}$, where the minor $m_{ij}$ is the determinant of the matrix obtained from $A$ by deleting the $i$-th row and the $j$-th column. Then you only have to transpose this matrix and divide every entry by the determinant of the original matrix. The whole process is the following:

If $A$ is invertible, then:

$$A^{-1}=\frac{1}{\det(A)} \operatorname{cof}(A)^T$$

Proof

As always, proofs are optional. In this case, you should be familiar with the properties of the determinant.

First, let's denote $X = A^{-1}$. The strategy is to get every entry of $X$. You know that $AX = I$. Let $j \in \{1, \dots, n\}$. On one hand, by the properties of the matrix product, the $j$-th column of $AX$ is the product of $A$ and the $j$-th column of $X$, that is, $A X_j$. On the other hand, as $AX = I$, the $j$-th column of $AX$ is just the $j$-th column of $I$, which is simply $e_j$. Thus:

$$A X_j = e_j$$

Nice! This is a simple system of linear equations with unknown vector $X_j$, so you can apply Cramer's Rule to get every entry of $X_j$. Then, for every $i \in \{1, \dots, n\}$:

$$x_{ij}= \frac{\det(A_{(i)})}{\det(A)} = \frac{\det([A_1 | \dots | A_{i-1}| e_j |A_{i+1} | \dots | A_n])}{\det(A)}$$

Here comes the important part. Look at the numerator. It is the determinant of a matrix whose $i$-th column is the vector $e_j$. That's a column with a bunch of zeros. Actually, all its entries are zero, except the $j$-th, which is $1$. Therefore, you can expand the determinant along this column, which means that only one term will survive:

$$\det([A_1 | \dots | A_{i-1}| e_j |A_{i+1} | \dots | A_n]) = c_{ji} = (-1)^{i+j} m_{ji}$$

But the minor $m_{ji}$ of the matrix $[A_1 | \dots | A_{i-1}| e_j |A_{i+1} | \dots | A_n]$ is exactly the minor $m_{ji}$ of the matrix $A$, because after removing the $j$-th row and the $i$-th column, the result is exactly the same in both cases (after all, both matrices coincide except for the $i$-th column, which is exactly the one you removed!). Thus:

$$x_{ij} = \frac{\det([A_1 | \dots | A_{i-1}| e_j |A_{i+1} | \dots | A_n])}{\det(A)} = \frac{(-1)^{i+j} m_{ji}}{\det(A)}$$

Finally, by the definition of the transpose, $c_{ji} = (-1)^{i+j} m_{ji}$ is the entry $(i,j)$ of $\operatorname{cof}(A)^T$, so $X = \frac{1}{\det(A)}\operatorname{cof}(A)^T$.

\blacksquare

As a simple but curious example, you can use this formula to find the inverse of any invertible matrix of size $2$. After easily computing all the possible cofactors of any matrix $A=\begin{bmatrix} a & b \\ c & d \end{bmatrix}$, you get that $\operatorname{cof}(A)=\begin{bmatrix} d & -c \\ -b & a \end{bmatrix}$. Thus:

$$A^{-1}=\frac{1}{\det(A)} \operatorname{cof}(A)^T=\frac{1}{ad-bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

Let's apply it to the matrix whose inverse you have already calculated, $A = \begin{bmatrix}2 & 3 \\ 1 & 4 \end{bmatrix}$. Then $\det(A) = 8 - 3 = 5$. Hence:

$$A^{-1} = \frac{1}{5} \begin{bmatrix}4 & -3 \\ -1 & 2 \end{bmatrix}$$

This is exactly the same result as before, and you didn't need to apply any row operations to calculate it. Nice!
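The cofactor formula also translates into a short Python sketch, although it is far less efficient than row reduction, since it computes $n^2$ determinants; all function names here are illustrative:

```python
from fractions import Fraction

def minor(M, i, j):
    # The matrix M with row i and column j removed.
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def det(M):
    # Determinant by cofactor expansion along the first row (fine for small matrices).
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def inverse_by_cofactors(A):
    n = len(A)
    d = Fraction(det(A))
    # Entry (i, j) of the inverse is the cofactor c_ji divided by det(A),
    # i.e. the transposed cofactor matrix scaled by 1/det(A).
    return [[(-1) ** (i + j) * det(minor(A, j, i)) / d for j in range(n)]
            for i in range(n)]

A = [[2, 3], [1, 4]]
assert inverse_by_cofactors(A) == [[Fraction(4, 5), Fraction(-3, 5)],
                                   [Fraction(-1, 5), Fraction(2, 5)]]
```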

Conclusion

  • Every elementary matrix is invertible, and its inverse is also elementary.
  • The product of invertible matrices $A$ and $B$ is also invertible, and $(AB)^{-1} = B^{-1} A^{-1}$.

  • $A$ is invertible if and only if its reduced row echelon form is $I$. In this case, its inverse is the result of applying to $I$ exactly the same elementary row operations that you used to transform $A$ into $I$.

  • When the matrix is invertible, $A^{-1}=\frac{1}{\det(A)} \operatorname{cof}(A)^T$, where $\operatorname{cof}(A)$ is the matrix of cofactors of $A$.

  • For any invertible $2\times 2$ matrix $A=\begin{bmatrix} a & b \\ c & d \end{bmatrix}$, the inverse is obtained by switching the entries in the main diagonal, changing the signs of the other entries, and finally dividing every entry by the determinant: $\frac{1}{ad-bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$.
