March 20, 2019

Linear Maps

Linear Maps
  1. Linear Maps
  2. Basic Properties
  3. Rank and Nullity

Sooo somewhere along the line I dropped the arrow notation for vectors. I was using $\v$ instead of $v$ to denote vectors, to keep things clean and avoid confusion, but it was a lot of extra work on my end, so it was inevitable that I'd eventually stop.

Linear Maps

Just as we decided to study continuous functions between topological spaces and homomorphisms between groups, much of linear algebra is dedicated to the study of linear maps between vector spaces.

If you've been paying close attention, you may have noticed that vector spaces are really just groups to which we have added scalar multiplication. In the same vein, linear maps are really just homomorphisms with an additional property which makes them compatible with scalar multiplication.

Definition. A linear map (or linear transformation) between two vector spaces $U$ and $V$ over a field $\F$ is a function $T:U\to V$ for which the following properties hold:

Additivity
For any vectors $u_1,u_2\in U$, we have that $T(u_1+u_2) = T(u_1) + T(u_2)$.

Homogeneity
For any vector $u\in U$ and any scalar $a\in\F$, we have that $T(au)=aT(u)$.

It is conventional to write $Tu$ to mean $T(u)$, when no confusion arises from this shorthand. The reason for this notation will become apparent shortly.

It is important to note that on the left side of each equation in this definition, the vector addition and scalar multiplication are being done in $U$. That is, we are applying $T$ to the sum $u_1+u_2\in U$ and to the scalar product $au\in U$. On the right, the addition and multiplication are being done in $V$. That is, we are first applying $T$ to $u_1$ and $u_2$ separately and adding the result in $V$, or applying it to $u$ and multiplying the result in $V$. So linear maps are precisely those functions for which it does not matter whether we add/multiply before or after applying the function.
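To make the two conditions concrete, here is a minimal numerical sketch (in Python with NumPy, which is my own choice and not part of this post's math) that spot-checks additivity and homogeneity on random vectors. The helper name `looks_linear` is hypothetical, and passing the check is numerical evidence of linearity, not a proof.

```python
import numpy as np

def looks_linear(T, dim, trials=100, tol=1e-9):
    """Spot-check additivity and homogeneity of T on random vectors
    in R^dim. Passing is numerical evidence, not a proof, of linearity."""
    rng = np.random.default_rng(0)
    for _ in range(trials):
        u1 = rng.normal(size=dim)
        u2 = rng.normal(size=dim)
        a = rng.normal()
        # Additivity: T(u1 + u2) should equal T(u1) + T(u2).
        if not np.allclose(T(u1 + u2), T(u1) + T(u2), atol=tol):
            return False
        # Homogeneity: T(a * u) should equal a * T(u).
        if not np.allclose(T(a * u1), a * T(u1), atol=tol):
            return False
    return True
```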

The reason a linear map can only be defined between vector spaces over the same field $\F$ is the homogeneity condition: the same scalar $a\in\F$ must act on vectors in both $U$ and $V$, so both spaces need to carry scalar multiplication by elements of the same field.

Example. We define a "zero map" $0:U\to V$ between any two vector spaces by $0(u)=0$ for all vectors $u\in U$. This is obviously a linear map because

$$\begin{align}
0(u_1 + u_2) &= 0 \\
&= 0 + 0 \\
&= 0(u_1) + 0(u_2),
\end{align}$$

which shows that it satisfies additivity, and

$$\begin{align}
0(au) &= 0 \\
&= a0 \\
&= a0(u),
\end{align}$$

which shows that it satisfies homogeneity.

Note that I am making no distinction between the zero vector in the domain and the zero vector in the codomain. Nonetheless, it is important to realize that they may be different objects. For example, in the case of a linear map from $\R^n$ to $\P(\R)$, the zero vector in $\R^n$ is the $n$-tuple $(0,0,\ldots,0)$ and the zero vector in $\P(\R)$ is the zero polynomial $p(x)=0$.

What's even worse, though, is that I'm also using the symbol $0$ to represent the zero map. Maybe I just worship chaos.

Example. The identity map $I:V\to V$ on any vector space, defined as usual by $Iv=v$ for all vectors $v\in V$, is a linear map. To see that it is additive, note that, for any vectors $v_1, v_2\in V$,

$$\begin{align}
I(v_1+v_2) &= v_1+v_2 \\
&= Iv_1 + Iv_2.
\end{align}$$

To see that it is homogeneous, note that, for any scalar $a\in\F$ and any vector $v\in V$,

$$\begin{align}
I(av) &= av \\
&= aIv.
\end{align}$$

The identity map is super important in linear algebra.

Example. The map $T:\R^2\to\R$ defined by $T(x,y)=x+y$ for any vector $(x,y)\in\R^2$ is a linear map.

This map is additive because

$$\begin{align}
T((x_1,y_1)+(x_2,y_2)) &= T(x_1+x_2,y_1+y_2) \\
&= x_1+x_2+y_1+y_2 \\
&= x_1+y_1+x_2+y_2 \\
&= T(x_1,y_1) + T(x_2,y_2).
\end{align}$$

It is homogeneous because

$$\begin{align}
T(a(x,y)) &= T(ax,ay) \\
&= ax+ay \\
&= a(x+y) \\
&= aT(x,y).
\end{align}$$
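As a quick sanity check of this example (a sketch assuming NumPy; this map could also be fed to the hypothetical `looks_linear` checker sketched after the definition):

```python
import numpy as np

T = lambda v: v[0] + v[1]  # T(x, y) = x + y

u1 = np.array([1.0, 2.0])
u2 = np.array([3.0, -1.0])

print(T(u1 + u2) == T(u1) + T(u2))  # True: additivity on this pair
print(T(2.5 * u1) == 2.5 * T(u1))   # True: homogeneity on this pair
```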

Example. The map $T:\R^2\to\P_2(\R)$ defined by $T(a,b)=a+bx+(a-b)x^2$ is a linear map.

It's additive because

$$\begin{align}
T((a_1,b_1)+(a_2,b_2)) &= T(a_1+a_2,b_1+b_2) \\
&= a_1+a_2 + (b_1+b_2)x + (a_1+a_2 - b_1-b_2)x^2\\
&= a_1 + b_1x + (a_1-b_1)x^2 + a_2 + b_2x + (a_2-b_2)x^2\\
&= T(a_1,b_1) + T(a_2,b_2).
\end{align}$$

And it's homogeneous because

$$\begin{align}
T(k(a,b)) &= T(ka,kb) \\
&= ka + kbx + (ka-kb)x^2 \\
&= k(a+bx+(a-b)x^2) \\
&= kT(a,b).
\end{align}$$
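If we identify the polynomial $a+bx+cx^2$ with its coefficient vector $(a,b,c)\in\R^3$, this map becomes multiplication by a fixed matrix. Here is a small sketch of that identification, assuming NumPy (the matrix $M$ is mine, read off from the formula above):

```python
import numpy as np

# Identify a + b*x + c*x^2 in P_2(R) with the coefficient vector (a, b, c).
# Then T(a, b) = a + b*x + (a - b)*x^2 is multiplication by this matrix:
M = np.array([[1, 0],    # coefficient of 1   is a
              [0, 1],    # coefficient of x   is b
              [1, -1]])  # coefficient of x^2 is a - b

print(M @ np.array([2, 3]))  # [ 2  3 -1], i.e. the polynomial 2 + 3x - x^2
```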

Basic Properties

What follows are some basic properties of linear maps.

The first result says that linear maps take the zero vector in the domain to the zero vector in the codomain.

Theorem. If $T:U\to V$ is a linear map, then $T(0)=0$.

Proof. We proceed directly via the following computation:

$$\begin{align}
T(0) &= T(0) + 0 \\
&= T(0) + T(0) - T(0) \\
&= T(0 + 0) - T(0) \\
&= T(0) - T(0) \\
&= 0.
\end{align}$$

Linear maps do a lot more than that though. They also map subspaces to subspaces!

Theorem. If $T:U\to V$ is a linear map and $W$ is a subspace of $U$, then $T[W]$ is a subspace of $V$.

Proof. We just need to show that $T[W]$ contains the zero vector and is closed under vector addition and scalar multiplication.

Since $W$ is a subspace, it contains the zero vector in $U$. Thus, since $T(0)=0$, it follows that $T[W]$ contains the zero vector in $V$.

Next, suppose $v_1,v_2\in T[W]$. Then $v_1=Tw_1$ and $v_2=Tw_2$ for some vectors $w_1,w_2\in W$. But since $W$ is a subspace, it contains $w_1+w_2$. Thus,

$$\begin{align}
v_1 + v_2 &= Tw_1 + Tw_2 \\
&= T(w_1 + w_2) \\
&\in T[W].
\end{align}$$

Lastly, suppose $v\in T[W]$ and $a\in\F$. Then $v=Tw$ for some $w\in W$. But since $W$ is a subspace, it contains $aw$. Thus,

$$\begin{align}
av &= aTw \\
&= T(aw) \\
&\in T[W].
\end{align}$$

It follows that $T[W]$ is a subspace of $V$, as desired.

The kernel of any linear transformation is a subspace of the domain.

Theorem. If $T:U\to V$ is a linear map, then $\ker T$ is a subspace of $U$.

Proof. Again, we need to show that $\ker T$ contains the zero vector and is closed under vector addition and scalar multiplication.

Since $T(0)=0$, certainly $0\in\ker T$ by definition.

Next, suppose $u_1,u_2\in\ker T$. Then $Tu_1=0$ and $Tu_2=0$. Thus,

$$\begin{align}
T(u_1 + u_2) &= Tu_1 + Tu_2 \\
&= 0 + 0 \\
&= 0,
\end{align}$$

and so $u_1+u_2\in\ker T$.

Lastly, suppose that $u\in\ker T$ and $a\in\F$. Then $Tu=0$, and therefore

$$\begin{align}
T(au) &= aTu \\
&= a0 \\
&= 0,
\end{align}$$

so $au\in\ker T$. It follows that $\ker T$ is a subspace of $U$.
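For a concrete matrix, a basis for the kernel can be computed numerically from the SVD: the right singular vectors whose singular values are (numerically) zero span $\ker T$. A minimal sketch, assuming NumPy (the helper name `null_space_basis` is mine):

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Columns of the returned matrix form an orthonormal basis for ker A:
    the right singular vectors with (numerically) zero singular values."""
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])    # rank 1, since row 2 = 2 * row 1
N = null_space_basis(A)

print(N.shape[1])             # 2, the dimension of ker A
print(np.allclose(A @ N, 0))  # True: every column of N is in ker A
```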

For some reason, in linear algebra it is common to refer to the kernel of a linear transformation as its null space, and to write $\ker T$ as $\null T$. I personally find this annoying, since in every other context we refer to the fiber of $0$ as the kernel. But you should at least be aware that this convention exists and is prevalent in the literature.

In the same vein, the image of any linear transformation is a subspace of the codomain. Actually we've already done all the work to show this.

Theorem. If $T:U\to V$ is a linear map, then $\im T$ is a subspace of $V$.

Proof. Since linear maps take subspaces to subspaces, $U$ is a subspace of itself, and $\im T = T[U]$, it follows that $\im T$ is a subspace of $V$.

Rank and Nullity

Now we will explore the interesting relation between the dimensions of the kernel and image of a linear map. It will take some time to build up to the "big result" of this section, but it is well worth the wait.

To understand this connection, we begin by considering what injective and surjective linear maps look like.

Theorem. A linear map $T:U\to V$ is injective if and only if $\ker T = \set{0}$.

Proof. Suppose first that $\ker T=\set{0}$, and suppose further that $Tu_1=Tu_2$. Then

$$\begin{align}
0 &= Tu_1 - Tu_2 \\
&= T(u_1-u_2),
\end{align}$$

and so $u_1-u_2\in\ker T$. But $\ker T$ contains only one element — the zero vector. So $u_1-u_2=0$, meaning $u_1=u_2$. It follows that $T$ is injective.

Suppose conversely that $T$ is injective. We know that $T(0)=0$. Furthermore, since $T$ is injective, at most one vector in the domain can be mapped to any vector in the codomain. This implies that $0$ is the only vector which gets mapped to zero. Hence, $\ker T=\set{0}$.

Injective linear maps have an extremely special property. Namely, they preserve linear independence!

Theorem. Let $T:U\to V$ denote an injective linear map, and suppose $(u_1,\ldots,u_n)$ is a linearly independent list of vectors in $U$. Then $(Tu_1,\ldots,Tu_n)$ is a linearly independent list of vectors in $V$.

Proof. Suppose we have scalars $a_i\in\F$ for $1\le i\le n$ such that

$$\begin{align}
\sum_{i=1}^n a_iTu_i &= T\left(\sum_{i=1}^n a_i u_i\right) \\
&= 0.
\end{align}$$

Since $T$ is injective, we know that $\ker T=\set{0}$ and therefore

$$\sum_{i=1}^n a_i u_i = 0.$$

Since $(u_1,\ldots,u_n)$ is linearly independent, this implies $a_i=0$ for $1\le i\le n$. Thus, $(Tu_1,\ldots,Tu_n)$ is linearly independent.
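Numerically, independence of a list of vectors amounts to full column rank of the matrix having those vectors as columns, so we can watch an injective map preserve it. A sketch assuming NumPy:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])  # injective: rank 2 equals the domain dimension

U = np.array([[1.0, 1.0],
              [0.0, 1.0]])  # columns (1, 0) and (1, 1): an independent list

images = A @ U  # columns are the images of the columns of U

# Independence == full column rank of the matrix of columns.
print(np.linalg.matrix_rank(U), np.linalg.matrix_rank(images))  # 2 2
```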

The next theorem is probably pretty obvious. It just says that if the domain of a linear map is finite-dimensional then so is its image.

Theorem. Let $T:U\to V$ denote a linear map. If $U$ is finite-dimensional, then $\im T$ is a finite-dimensional subspace of $V$.

Proof. We have already shown that $\im T$ is a subspace of $V$. To show that it is finite-dimensional, we must demonstrate that it is spanned by a finite list of vectors in $V$. Let $(e_1,\ldots,e_n)$ be any basis for $U$. We claim that

$$\im T = \span(Te_1,\ldots,Te_n).$$

To see this, choose any vector $v\in\im T$. Then $v=Tu$ for some $u\in U$. We may write $u$ as a linear combination of basis vectors

$$u = \sum_{i=1}^n a_i e_i,$$

where $a_i\in\F$ for $1\le i\le n$. Thus,

$$\begin{align}
v &= Tu \\
&= T\left(\sum_{i=1}^n a_i e_i\right) \\
&= \sum_{i=1}^n a_i Te_i.
\end{align}$$

We have expressed $v$ as a linear combination of the vectors $Te_1,\ldots,Te_n$, so $\im T\subseteq\span(Te_1,\ldots,Te_n)$. Conversely, each $Te_i$ lies in $\im T$, which is a subspace and hence closed under linear combinations. It follows that

$$\im T = \span(Te_1,\ldots,Te_n).$$

Since $\im T$ is spanned by a finite list of vectors, it is finite-dimensional.
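For a matrix $A$ acting on $\R^n$, this theorem is especially concrete: $Ae_i$ is exactly the $i$-th column of $A$, so the image is the span of the columns (the column space). A quick sketch, assuming NumPy:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [2.0, 1.0, 3.0]])  # a map from R^3 to R^2

# A e_i is exactly the i-th column of A, so im A = span of the columns.
for i in range(3):
    e_i = np.eye(3)[:, i]
    print(A @ e_i)  # prints column i of A
```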

We now come to the main theorem of this post, which states that the dimension of the domain of a linear map is always equal to the sum of the dimensions of its kernel and image.

Rank-Nullity Theorem. Suppose $U$ is a finite-dimensional vector space, $V$ is any vector space, and $T:U\to V$ is a linear map. Then

$$\dim U = \dim\ker T + \dim\im T.$$

Proof. Let $n=\dim\ker T$, and choose any basis $(e_1,\ldots,e_n)$ for $\ker T$. Then we can extend this list to a basis $(e_1,\ldots,e_n,f_1,\ldots,f_m)$ for $U$ by adding $m=\dim U - n$ linearly independent vectors. We argue that $\dim\im T=m$ by showing that $(Tf_1,\ldots,Tf_m)$ is a basis for $\im T$.

By the previous theorem, $\im T=\span(Te_1,\ldots,Te_n,Tf_1,\ldots,Tf_m)$. But each $e_i$ is in the kernel of $T$, so $Te_i=0$ and thus $\im T = \span(Tf_1,\ldots,Tf_m)$. It suffices to show, then, that $(Tf_1,\ldots,Tf_m)$ is linearly independent.

To this end, suppose we are given $a_i$ for $1\le i\le m$ such that

$$\sum_{i=1}^m a_i Tf_i = 0.$$

Then since $T$ is a linear map, we have that

$$T\left(\sum_{i=1}^m a_i f_i\right) = 0.$$

It follows that

$$\sum_{i=1}^m a_i f_i\in\ker T,$$

and so we can rewrite this uniquely in terms of basis vectors for $\ker T$:

$$\sum_{i=1}^m a_i f_i = \sum_{i=1}^n b_i e_i, \tag{1}$$

where $b_i\in\F$ for $1\le i\le n$. Since $(e_1,\ldots,e_n,f_1,\ldots,f_m)$ is a basis for $U$, it is linearly independent. Moving everything in equation $(1)$ to one side gives a linear combination of these basis vectors equal to zero, so $a_i=0$ for $1\le i\le m$ and $b_i=0$ for $1\le i\le n$. Hence, $(Tf_1,\ldots,Tf_m)$ is linearly independent and so it is a basis for $\im T$. It follows that $\dim\im T = m$, and thus

$$\dim U = \dim\ker T + \dim\im T.$$

Essentially what this means is that with any linear map, every dimension of the domain is accounted for. By this, I really mean that every one-dimensional subspace is either squashed down to zero, and thus lives in the kernel, or its independence is preserved and it gets mapped to a unique one-dimensional subspace of the codomain.

The reason this is called the rank-nullity theorem is that we make the following definitions:

Definition. Let $T:U\to V$ denote a linear map. The rank of $T$ is

$$\text{rank }T = \dim\im T.$$

Similarly, the nullity of $T$ is

$$\text{nullity }T = \dim\ker T.$$
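Here is a small numerical check of the theorem (a sketch assuming NumPy): compute the rank as the number of nonzero singular values, take the remaining right singular vectors as a basis for the kernel, and confirm the two dimensions sum to the dimension of the domain.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0, 4.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, 3.0, 4.0, 5.0]])  # maps R^4 to R^3; row 3 = row 1 + row 2

_, s, vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))  # dim im T: number of nonzero singular values
kernel_basis = vt[rank:].T     # remaining right singular vectors span ker T
nullity = kernel_basis.shape[1]

print(rank, nullity)                 # 2 2
print(rank + nullity == A.shape[1])  # True: rank + nullity = dim U = 4
```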

The rank-nullity theorem has an interesting implication.

Theorem. If $U$ and $V$ are finite-dimensional vector spaces with $\dim U=\dim V$ and $T:U\to V$ is a linear map, then $T$ is injective if and only if it is surjective.

Proof. Suppose first that $T$ is injective, so that $\ker T=\set{0}$. Then $\dim\ker T = 0$, so from the rank-nullity theorem it follows that

$$\begin{align}
\dim\im T &= \dim U \\
&= \dim V,
\end{align}$$

and thus $\im T = V$ because the only subspace of a finite-dimensional vector space whose dimension is the same as the parent space is the parent space itself. Thus, $T$ is surjective.

Suppose next that $T$ is surjective. Then

$$\begin{align}
\dim\im T &= \dim V \\
&= \dim U.
\end{align}$$

From the rank-nullity theorem, it follows that $\dim\ker T = 0$. The only subspace with dimension zero is the trivial subspace, and so $\ker T = \set{0}$. Thus, $T$ is injective.
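In matrix terms, for a square matrix both conditions reduce to the same statement: full rank. A short sketch assuming NumPy:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])  # a map from R^2 to R^2

rank = np.linalg.matrix_rank(A)
injective = (rank == A.shape[1])   # trivial kernel: nullity is 0
surjective = (rank == A.shape[0])  # image is all of the codomain

print(injective, surjective)  # True True: the two always agree when square
```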

So for two finite-dimensional vector spaces of the same dimension, injective and surjective linear maps are actually the same thing. Remember that a function that is both injective and surjective is called bijective and has an inverse function. As you shall see, all kinds of interesting results follow from this observation.