Gilbert Strang's lesson “Projection matrices and least squares” is very nice and useful (you can find it here), but, as often happens with him, you have to work out some of the passages on your own.
Now the problem.
Given a matrix $A$ of real numbers with $m$ rows and $n$ columns, its columns span a vector subspace of $\mathbb{R}^m$, which coincides with $\mathbb{R}^m$ itself in case $n \ge m$ and at least $m$ columns are linearly independent. Given a vector $b$ in $\mathbb{R}^m$, not necessarily belonging to the column space of $A$ (denoted $C(A)$), which is the nearest vector of $C(A)$ to $b$?
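To make the question concrete, here is a minimal numerical sketch (the matrix $A$ and the vector $b$ below are my own hypothetical example, not taken from the lesson):

```python
import numpy as np

# Hypothetical example: a 3x2 matrix whose two independent columns span a plane in R^3.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])

# A vector b chosen so that it does NOT lie in C(A).
b = np.array([6.0, 0.0, 0.0])

# b belongs to C(A) exactly when appending it as an extra column leaves the rank unchanged.
rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
print(rank_A, rank_Ab)  # 2 3 -> b is outside the column space
```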
And now we start to investigate…
First consideration.
We can restrict the columns of $A$ to just those that are independent, because they are a basis for $C(A)$ and so they span it all.
Second consideration.
Suppose that a vector $p$ of $C(A)$ exists such that $b - p$ is orthogonal to $C(A)$. In such a case, would it be the solution we are looking for? Yes, of course. Why?
The reason is very simple: consider any other vector of $C(A)$, which we call $q$, with $q \neq p$. Is $b - q$ longer or shorter than $b - p$? It's longer. Indeed $b - q = (b - p) + (p - q)$, but then $\|b - q\|^2 = \|(b - p) + (p - q)\|^2$.
Now $\|b - q\|^2 = \|b - p\|^2 + \|p - q\|^2 + 2\langle b - p, p - q \rangle$ (where $\langle \cdot , \cdot \rangle$ stands for the inner product between vectors).
But $b - p$ is orthogonal to $C(A)$, and $p - q$ belongs to $C(A)$, so $\langle b - p, p - q \rangle = 0$. Finally $\|b - q\|^2 = \|b - p\|^2 + \|p - q\|^2$, which is greater than $\|b - p\|^2$!
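Here is a quick numerical check of this Pythagorean argument, reusing the hypothetical $A$ and $b$ from above (numpy's least-squares routine is used only to obtain $p$; the formula for it is derived later):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# p: the candidate projection; numpy's least-squares routine returns the
# coefficients x_hat of the orthogonal projection, so p = A @ x_hat.
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]
p = A @ x_hat

# q: any other vector of C(A), built from arbitrary coefficients.
q = A @ np.array([2.0, -1.0])

# The cross term vanishes, so the squared lengths obey Pythagoras.
print(np.dot(b - p, p - q))  # ~0
lhs = np.linalg.norm(b - q) ** 2
rhs = np.linalg.norm(b - p) ** 2 + np.linalg.norm(p - q) ** 2
print(np.isclose(lhs, rhs))  # True: b - q is longer than b - p
```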
But now we have another question: does such a vector $p$ surely exist?
From previous lessons we know that $\mathbb{R}^m$ is the direct sum of 2 specific subspaces: the column space of $A$ and the null space of $A^T$, which is orthogonal to $C(A)$. So any vector $b$ belonging to $\mathbb{R}^m$ can be expressed in a unique way as a linear combination of the union of 2 bases: one of $C(A)$ and one of $N(A^T)$. But the part of the combination from the first basis is $p$ and the part from the second is $b - p$! So such a projection exists and is unique.
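The same decomposition can be verified numerically (again a sketch on the hypothetical $A$ and $b$ from above):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Split b into p (in C(A)) and e = b - p (in N(A^T)).
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]
p = A @ x_hat
e = b - p

print(np.allclose(p + e, b))      # True: the two pieces rebuild b
print(np.allclose(A.T @ e, 0.0))  # True: e lies in the null space of A^T
```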
Now we want to find the projection. Is there a way to express it as a function of $A$ and $b$?
Yes, there is. Consider the vector $\hat{x}$ of $\mathbb{R}^n$ such that $A\hat{x} = p$. We know that $p$ belongs to $C(A)$ and that $b - p$ is orthogonal to $C(A)$.
So $b - A\hat{x}$ is orthogonal to $C(A)$. This can be expressed using the inner product as follows: $\langle Ax, b - A\hat{x} \rangle = 0$ for any $x$ belonging to $\mathbb{R}^n$. But then it means that $x^T A^T (b - A\hat{x}) = 0$ for any $x$.
As a consequence, the vector $A^T (b - A\hat{x})$ must be $0$!
So $A^T (b - A\hat{x}) = 0$, or equivalently $A^T A \hat{x} = A^T b$. But we know that surely such an $\hat{x}$ exists and that it is unique too: in fact $A\hat{x} = p$, the projection exists and is unique, and we chose to limit the columns of $A$ to the independent ones only, so the coefficients $\hat{x}$ are determined uniquely as well.
But then it means that $A^T A$ is invertible. So $\hat{x} = (A^T A)^{-1} A^T b$ and $p = A (A^T A)^{-1} A^T b$, where the matrix $P = A (A^T A)^{-1} A^T$ is called the projection matrix: it allows us to get the projection of any vector in $\mathbb{R}^m$.
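Putting the formula to work on the running hypothetical example (a sketch: in practice one solves the normal equations directly rather than forming the explicit inverse):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Normal equations: A^T A x_hat = A^T b (solved without an explicit inverse).
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# The projection matrix P = A (A^T A)^{-1} A^T, formed here only for illustration.
P = A @ np.linalg.inv(A.T @ A) @ A.T
p = P @ b

print(np.allclose(p, A @ x_hat))      # True: both routes give the same projection
print(np.allclose(A.T @ (b - p), 0))  # True: the error is orthogonal to C(A)
print(np.linalg.matrix_rank(A.T @ A) == A.shape[1])  # True: A^T A is invertible
```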
An indirect but interesting result is that if the columns of $A$ are independent, then $A^T A$ is invertible!