On the Determinant and the Trace of a matrix

Suppose you have a square matrix A with n rows and n columns. Is there a relationship between its eigenvalues and its determinant? And what about its trace?

Determinant of A.

Consider the determinant \mid A- \lambda I \mid, which is a polynomial of degree n in \lambda. The values of \lambda that solve the equation \mid A- \lambda I \mid = 0 are the eigenvalues of A, and there are n of them (counted with multiplicity), as many as the degree of the polynomial.
We can write such a polynomial using its roots (the eigenvalues) as follows:
\mid A- \lambda I \mid = (\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda)

Its constant term is: \lambda_1\cdots\lambda_n,
but it's also the value of \mid A- \lambda I \mid at \lambda = 0, which obviously is \mid A \mid.

So \mid A \mid = \lambda_1\cdots\lambda_n.
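This is easy to check numerically. Here is a minimal sketch using numpy (the 3x3 matrix below is an arbitrary example of mine, nothing special about it):

```python
import numpy as np

# An arbitrary square matrix (any one works).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigenvalues = np.linalg.eigvals(A)

# The determinant equals the product of the eigenvalues.
print(np.linalg.det(A))      # 8.0 (up to rounding)
print(np.prod(eigenvalues))  # the same value
```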

Trace of A.

If we expand the product and collect the term of degree n-1, its coefficient is:
(-1)^{n-1}(\lambda_1+\lambda_2+\cdots+\lambda_n).

For simplicity we consider the case of a 3x3 matrix:

\mid A- \lambda I \mid = \begin{vmatrix}a_{11}-\lambda & a_{12} & a_{13} \\a_{21} & a_{22}-\lambda & a_{23} \\a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}

Since the determinant is linear in each row separately, it can be decomposed along the first row as follows:

\mid A- \lambda I \mid = \begin{vmatrix}a_{11} & a_{12} & a_{13} \\a_{21} & a_{22}-\lambda & a_{23} \\a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}+\begin{vmatrix}-\lambda & 0 & 0 \\a_{21} & a_{22}-\lambda & a_{23} \\a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}

Now we can apply the same decomposition to the second row of the second determinant:

\begin{vmatrix}-\lambda & 0 & 0 \\a_{21} & a_{22}-\lambda & a_{23} \\a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}=\begin{vmatrix}-\lambda & 0 & 0 \\a_{21} & a_{22} & a_{23} \\a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}+\begin{vmatrix}-\lambda & 0 & 0 \\0 & -\lambda & 0 \\a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}

And finally we can apply the decomposition to the third row of the last determinant:


\begin{vmatrix}-\lambda & 0 & 0 \\0 & -\lambda & 0 \\a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}=\begin{vmatrix}-\lambda & 0 & 0 \\0 & -\lambda & 0 \\a_{31} & a_{32} & a_{33} \end{vmatrix}+\begin{vmatrix}-\lambda & 0 & 0 \\0 & -\lambda & 0 \\0 & 0 & -\lambda \end{vmatrix}

The determinants that contribute to the term of degree n-1 of \mid A- \lambda I \mid are the following (the remaining piece, \mid -\lambda I \mid = (-\lambda)^3, only contributes to the term of degree n):

D1=\begin{vmatrix}a_{11} & a_{12} & a_{13} \\a_{21} & a_{22}-\lambda & a_{23} \\a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}

D2=\begin{vmatrix}-\lambda & 0 & 0 \\a_{21} & a_{22} & a_{23} \\a_{31} & a_{32} & a_{33}-\lambda \end{vmatrix}

D3=\begin{vmatrix}-\lambda & 0 & 0 \\0 & -\lambda & 0 \\a_{31} & a_{32} & a_{33} \end{vmatrix}

So their contribution to the term of degree n-1 = 2 is: (a_{11}+a_{22}+a_{33})\lambda^2.

Generalizing, the coefficient of \lambda^{n-1} is: (-1)^{n-1}(a_{11}+\cdots+a_{nn}).

Comparing this with the coefficient (-1)^{n-1}(\lambda_1+\cdots+\lambda_n) found above from the roots, we finally get Trace(A)=\sum_1^n a_{ii}=\sum_1^n \lambda_{j}
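Again, a small numpy check of both results, reusing the same arbitrary example matrix. Note that np.poly returns the coefficients of the monic polynomial \mid \lambda I - A \mid, which is (-1)^n times the one used above, so its \lambda^{n-1} coefficient is -Trace(A):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# The trace equals the sum of the eigenvalues.
print(np.trace(A))                   # 7.0
print(np.sum(np.linalg.eigvals(A)))  # the same value (up to rounding)

# Coefficients of det(lambda*I - A), highest degree first:
# the lambda^(n-1) coefficient is -(lambda_1 + ... + lambda_n) = -Trace(A).
print(np.poly(A)[1])                 # -7.0
```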

Projection matrices and least squares

The lesson by Gilbert Strang “Projection matrices and least squares” is very nice and useful (you can find it here), but as often happens with his lectures you have to work out some of the passages yourself.

Now the problem. 

Given a matrix A of real numbers with m rows and n columns, its columns span a vector subspace of R^{m}, which coincides with the whole of R^{m} when m <= n and at least m of the columns are linearly independent. Given a vector b in R^{m}, not necessarily belonging to the column space of A (denoted C(A)), which vector of C(A) is nearest to b?

And now we start to investigate…

First consideration.
We can restrict the columns of A to just those that are independent: they form a basis for C(A), so they span it all.

Second consideration.
Suppose that a vector p \in C(A) exists such that e=b-p is orthogonal to C(A). In such a case, would it be the solution we are looking for? Yes, of course. Why?
The reason is very simple: consider any other vector of C(A), call it p_1, and write b=p_1+e_1. Is e_1 longer or shorter than e? It's longer. Indeed b=p_1+(p-p_1)+e, so e_1=(p-p_1)+e.
Now |e_1|^2=e_1 \cdot e_1=|p-p_1|^2+|e|^2+2(p-p_1) \cdot e
(where \cdot stands for the inner product between vectors).
But e is orthogonal to C(A) and p-p_1 belongs to C(A), so (p-p_1) \cdot e=0. Finally |e_1|^2=|p-p_1|^2+|e|^2, which is greater than |e|^2 whenever p_1 \neq p!

But now we have another question: does such a vector p always exist?
From previous lessons we know that R^m is the direct sum of two orthogonal subspaces: the column space of A and the null space of A^T, which is the orthogonal complement of C(A). So any vector b belonging to R^m can be expressed in a unique way as a linear combination of the union of two bases: one of C(A) and one of N(A^T). But the part of the combination coming from C(A) is p and the part coming from N(A^T) is e! So such a projection exists and is unique.
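The decomposition b = p + e is easy to exhibit numerically. In this sketch (A and b are arbitrary examples of mine) the projection p is obtained through numpy's least-squares routine, which computes exactly the nearest vector of C(A) to b:

```python
import numpy as np

# A has two independent columns; b is a generic vector of R^3.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

# x minimizes |Ax - b|, so p = Ax is the projection of b onto C(A).
x, *_ = np.linalg.lstsq(A, b, rcond=None)
p = A @ x
e = b - p

print(A.T @ e)  # ~ [0, 0]: e is orthogonal to every column of A, i.e. e is in N(A^T)
print(p + e)    # recovers b
```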

Now we want to find the projection. Is there a way to express it as a function of A and b?
Yes, there is. Consider the vector x of R^n such that Ax=p. We know that b=p+e and that e is orthogonal to C(A).
So Ax-b=-e is orthogonal to C(A). This can be expressed using the inner product as follows: (Ax-b) \cdot Az=0 for any z belonging to R^n, that is, (Ax-b)^TAz = 0 for any z.
As a consequence the row vector (Ax-b)^TA must be 0: the only vector whose inner product with every z is zero is the zero vector!

So x^TA^TA = b^TA, or equivalently A^TAx=A^Tb. But we know that such an x surely exists and that it is unique too: in fact Ax = p, and since we kept only the independent columns of A, x is uniquely determined by p.
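In code, the normal equations can be solved directly (same arbitrary A and b as in the previous sketch):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

# Solve the normal equations A^T A x = A^T b for x.
x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)      # same x as returned by np.linalg.lstsq
print(A @ x)  # the projection p
```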

But then it means that A^TA is invertible. So x=(A^TA)^{-1}A^Tb and p=A(A^TA)^{-1}A^Tb, where the matrix P=A(A^TA)^{-1}A^T is called the projection matrix: it gives the projection onto C(A) of any vector b in R^m.
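As a final check, here is a minimal numpy sketch that builds the projection matrix explicitly (same arbitrary A and b as above) and verifies its characteristic properties:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

# P = A (A^T A)^{-1} A^T; A^T A is invertible because the columns of A are independent.
P = A @ np.linalg.inv(A.T @ A) @ A.T

p = P @ b
print(p)                      # the projection of b onto C(A)
print(A.T @ (b - p))          # ~ [0, 0]: the error is orthogonal to C(A)
print(np.allclose(P @ P, P))  # True: projecting a second time changes nothing
```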

An indirect but interesting result is that if the columns of A are independent, then A^TA is invertible!