Tangent Spaces and the Pushforward

In calculus, you learn how to construct tangent lines to differentiable curves at a point.

In multivariable calculus, you learn how to construct tangent planes to differentiable surfaces at a point.

In differential geometry, the analogous concept is the tangent space to a smooth manifold at a point, but there's some subtlety to this concept. Notice how the curves and surface in the examples above are sitting in a higher-dimensional space in order to make sense of their tangent lines/plane.

The circle is a one-dimensional manifold, yet we've constructed its tangent line as an extrinsic object which lives outside the circle, and in doing so we have embedded the one-dimensional circle into a two-dimensional space. Similarly, in order to construct the tangent plane to the sphere, we've embedded it in three-dimensional space, although the sphere is a two-dimensional manifold.

This is not a huge issue for these familiar objects. But since we want to use manifolds to describe things like higher-dimensional surfaces and spacetime in general relativity, we need some way to talk about the tangent space intrinsically, without embedding our manifold in a higher-dimensional space. After all, it wouldn't really make sense to think about spacetime as sitting inside of some higher-dimensional manifold.

Our construction of tangent spaces will follow this intrinsic approach. It has the benefit of producing a tangent space isomorphic to the one given by the parallel extrinsic approach (as we shall see later when we discuss the Whitney Embedding Theorem), and it is also simple and convenient for calculation. So let's jump right into it!

The fundamental concept in defining tangent spaces is that of a smooth curve $\gamma:\R\to M$. Of course, we haven't actually defined what it means for such a curve to be smooth yet, but we can easily borrow from our experience in defining smooth functions to accomplish this.

Definition. Given a smooth manifold $M$, a curve $\gamma:\R\to M$ is a smooth curve if for every point $p\in \gamma[\R]\subseteq M$ there is a chart $(U, x)$ whose domain contains $p$ and for which $x\circ\gamma$ is smooth in the sense of ordinary differential calculus.

Of course, we need to justify that this is a well-defined notion, since we defined it in terms of charts. But since chart transition maps on a smooth manifold are smooth, the proof of this fact looks almost identical to the analogous arguments from the last post, so I will not give it here.

In regular multivariable calculus, we have the idea of vectors based at a point and gradients of functions, and we form directional derivatives by combining vectors and gradients using an inner product. In differential geometry, there is no separating the concepts. Vectors are directional derivatives, plain and simple.

The key idea in defining tangent vectors is the following: We can use smooth curves as measurement devices to determine the rate at which a smooth function is changing as we move along our manifold. Here's how.

Definition. Let $M$ be a smooth manifold, $\gamma:\R\to M$ a smooth curve and $(U, x)$ a chart whose domain contains the point $p\in\gamma[\R]$. The directional derivative along the curve $\gamma$ at the point $p$ is the map $$V_{\gamma, p}:C^\infty(M)\to\R$$ defined by $$V_{\gamma, p}f = (f\circ\gamma)'\at{\gamma^{-1}(p)}$$ for any smooth function $f\in C^\infty(M)$.

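To clarify, this is just the regular derivative of the function $f\circ\gamma:\R\to\R$ taken at the point $\gamma^{-1}(p)$, which is of course a real number. (Here we are tacitly assuming that $\gamma$ passes through $p$ exactly once, so that $\gamma^{-1}(p)$ denotes a single parameter value.)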
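As a quick sanity check, here is a worked example. The manifold, curve and function are my own illustrative choices, not part of the construction above. Take $M=\R^2$, the curve $\gamma(t)=(t,t^2)$, the function $f(u,v)=uv$ and the point $p=(1,1)$, so that $\gamma^{-1}(p)=1$. Then $(f\circ\gamma)(t)=t^3$, and so

$$V_{\gamma, p}f = (f\circ\gamma)'\at{1} = 3t^2\at{t=1} = 3.$$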

So these directional derivatives are maps which act on smooth functions and spit out real numbers. To see how these directional derivatives behave as measurement devices, as I claimed above, it is useful to look at level sets of the function $f$. A level set is just the set of points in our manifold $M$ which our function $f$ maps to the same value. To continue my analogy from last post, if $f$ describes the temperature of points on the manifold, a level set of $f$ is a set of points where the temperature is the same. Some level set of such a function might look something like this:

Since the values of our function are constant along level sets, we would expect the directional derivative along these level sets of our function to be zero. To be more precise, suppose $f:M\to\R$ is a smooth function and $L=\{p\in M\mid f(p) = k\}$ is a level set of $f$ where $k\in\R$ is just some constant. Now suppose $\gamma:\R\to M$ is a smooth curve whose image lies in the level set $L$, i.e., $\gamma[\R]\subseteq L$. Then by the definition of $L$, we have that $f(\gamma(x))=k$ for every $x\in\R$. That is, $f\circ\gamma:\R\to\R$ is a constant function, and so its derivative $(f\circ\gamma)'$ is zero everywhere. Thus, the directional derivative $V_{\gamma, p}f$ is zero at every point $p\in\gamma[\R]$.

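Suppose instead we have a curve $\sigma:\R\to M$ along which the value of our smooth function $f$ is increasing. That is, if $x,y\in\R$ with $x<y$, then $(f\circ\sigma)(x)<(f\circ\sigma)(y)$. Then $(f\circ\sigma)'\geq 0$ everywhere, and so the directional derivative $V_{\sigma, p}f$ is nonnegative at every point $p\in\sigma[\R]$, and strictly positive wherever $(f\circ\sigma)'$ does not vanish. (A strictly increasing function such as $t\mapsto t^3$ can still have zero derivative at an isolated point.)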
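To see these two situations numerically, here is a minimal Python sketch. Everything in it is an assumption chosen for illustration: the manifold is $\R^2$, the function is $f(u,v)=u^2+v^2$, and the helper `directional_derivative` approximates $(f\circ\gamma)'$ with a central difference rather than computing it exactly.

```python
import math

def directional_derivative(f, curve, t0, h=1e-5):
    """Central-difference estimate of (f o curve)'(t0)."""
    return (f(curve(t0 + h)) - f(curve(t0 - h))) / (2 * h)

# Illustrative choice: M = R^2 with f(u, v) = u^2 + v^2.
def f(point):
    u, v = point
    return u * u + v * v

# gamma stays inside the level set f = 1 (the unit circle).
def gamma(t):
    return (math.cos(t), math.sin(t))

# sigma moves radially outward, so f o sigma = exp(2t) is increasing.
def sigma(t):
    return (math.exp(t) * math.cos(1.0), math.exp(t) * math.sin(1.0))

along_level_set = directional_derivative(f, gamma, 0.3)
along_radial = directional_derivative(f, sigma, 0.0)
print(along_level_set, along_radial)
```

The first value should be zero up to rounding error, and the second should be close to $2$, since $(f\circ\sigma)(t)=e^{2t}$.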

A very nice property of these directional derivatives is that they are linear maps!

Theorem. Let $M$ be a smooth manifold, $\gamma:\R\to M$ a smooth curve and $(U, x)$ a chart whose domain contains the point $p\in\gamma[\R]$. Then the directional derivative $V_{\gamma,p}:C^\infty(M)\to\R$ is a linear map.

Proof. We will show first that $V_{\gamma,p}$ is additive. Suppose that $f,g:M\to\R$ are smooth functions. Then

$$\begin{align}
V_{\gamma, p}(f + g) &= ((f+g)\circ\gamma)'\at{\gamma^{-1}(p)} \\
&= (f\circ\gamma + g\circ\gamma)'\at{\gamma^{-1}(p)} \hint{distributive property}\\
&= (f\circ\gamma)'\at{\gamma^{-1}(p)} + (g\circ\gamma)'\at{\gamma^{-1}(p)} \hint{linearity of derivative} \\
&= V_{\gamma, p}f + V_{\gamma, p}g.
\end{align}$$

Next we will show that $V_{\gamma,p}$ is homogeneous. Suppose $f:M\to\R$ is a smooth function and $a\in\R$. Then

$$\begin{align}
V_{\gamma, p}(af) &= ((af)\circ\gamma)'\at{\gamma^{-1}(p)} \\
&= (a(f\circ\gamma))'\at{\gamma^{-1}(p)} \hint{distributive property}\\
&= a(f\circ\gamma)'\at{\gamma^{-1}(p)} \hint{linearity of derivative} \\
&= aV_{\gamma, p}f.
\end{align}$$

Thus, the directional derivative is a linear map.

It turns out that, in addition to being a linear map, directional derivatives are also something called derivations. This simply means they obey a "product rule"

$$
\begin{align}
V_{\gamma, p}(fg) &= ((fg)\circ\gamma)'\at{\gamma^{-1}(p)} \\
&= ((f\circ\gamma)(g\circ\gamma))'\at{\gamma^{-1}(p)} \\
&= (f\circ\gamma)'\at{\gamma^{-1}(p)}(g\circ\gamma)\at{\gamma^{-1}(p)} + (f\circ\gamma)\at{\gamma^{-1}(p)}(g\circ\gamma)'\at{\gamma^{-1}(p)} \\
&= (V_{\gamma, p}f)g(p) + f(p)V_{\gamma, p}(g)
\end{align}
$$

which is inherited from the product rule for ordinary derivatives. We are now equipped to define the tangent space to a manifold at a point!
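Linearity and the product rule can also be spot-checked numerically. The sketch below is only a check under assumptions of my choosing (the curve `gamma`, the functions `f` and `g`, and the scalar `a` are all hypothetical), and it uses a central-difference approximation in place of an exact derivative.

```python
import math

def directional_derivative(f, curve, t0, h=1e-5):
    """Central-difference estimate of (f o curve)'(t0)."""
    return (f(curve(t0 + h)) - f(curve(t0 - h))) / (2 * h)

# Illustrative curve in M = R^2 passing through p = (1, 2) at t = 0.
def gamma(t):
    return (1.0 + t, 2.0 - t * t)

def f(point):
    u, v = point
    return u * v

def g(point):
    u, v = point
    return math.sin(u) + v

p = gamma(0.0)
a = 3.0

Vf = directional_derivative(f, gamma, 0.0)
Vg = directional_derivative(g, gamma, 0.0)

# Apply the same operator to f + g, a*f and f*g.
V_sum = directional_derivative(lambda q: f(q) + g(q), gamma, 0.0)
V_scaled = directional_derivative(lambda q: a * f(q), gamma, 0.0)
V_prod = directional_derivative(lambda q: f(q) * g(q), gamma, 0.0)

# Right-hand side of the product rule.
leibniz = Vf * g(p) + f(p) * Vg
print(V_sum - (Vf + Vg), V_scaled - a * Vf, V_prod - leibniz)
```

All three printed differences should be zero up to the error of the finite-difference approximation.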

Definition. Let $M$ be a smooth manifold with $p\in M$. The tangent space to $M$ at the point $p$, written $T_p M$, is the vector space of all directional derivatives along curves at $p$.

We have called the tangent space a vector space without proving that it is one. Since directional derivatives are functions $C^\infty(M)\to\R$, we can define vector addition as pointwise addition of functions and scalar multiplication as pointwise multiplication of a function by a real number. While these resulting objects are definitely functions, we have to show that they are directional derivatives. Otherwise our tangent space would not be closed under addition or scalar multiplication, which is not allowed!

Theorem. The tangent space $T_p M$ of a smooth manifold at a point $p\in M$ is a vector space.

Proof. We will show first that $T_p M$ is closed under vector addition. That is, we must show that

$$\begin{align}
(V_{\gamma,p}+V_{\sigma,p})f &= V_{\gamma,p}f+V_{\sigma,p}f\\
&\in T_p M
\end{align}$$

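for any smooth curves $\gamma,\sigma:\R\to M$ passing through $p$. To do this, we must find a curve $\psi:\R\to M$ whose directional derivative $V_{\psi, p}$ is the sum of the directional derivatives along $\gamma$ and $\sigma$. By reparametrizing if necessary (shifting the parameter does not change the directional derivative at $p$), we may assume that both curves reach $p$ at the same parameter value, so that $\gamma^{-1}(p)=\sigma^{-1}(p)$.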

To construct such a curve, we need to choose a chart $(U, x)$ whose domain contains $p$, and look at the chart representatives $x\circ\gamma:\R\to\R^n$ and $x\circ\sigma:\R\to\R^n$, where $n=\dim M$. Addition of these representatives makes sense in our chart. However, since we are going to define our curve $\psi$ in terms of a chart, we need to make sure that our definition is independent of our choice of chart! We keep this in the back of our minds, as this independence will follow quite naturally.

If we add the two chart representatives $x\circ\gamma$ and $x\circ\sigma$ we will not end up with something which passes through $x(p)$ (unless $x(p)$ happens to be zero). To remedy this situation, we seek to define a new curve $\psi$ whose chart representative $x\circ\psi$ is

$$x\circ\gamma + x\circ\sigma - x(p).$$

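This will guarantee that $x\circ\psi$ passes through $x(p)$. To construct this curve $\psi$, we simply take our desired chart representative and apply $x^{-1}$. (Strictly speaking, this only defines $\psi$ for parameter values close to $\gamma^{-1}(p)$, where everything stays inside the chart. Since derivatives are local, that is all we need.)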

$$\psi = x^{-1}(x\circ\gamma + x\circ\sigma - x(p)).$$

Now we must show that $\psi$ really is the curve whose directional derivative is the sum of the directional derivatives of $\gamma$ and $\sigma$. That is, we need to demonstrate that $V_{\psi,p} = V_{\gamma,p} + V_{\sigma,p}$. This is quite straightforward. For any smooth function $f:M\to\R$,

$$
\begin{align}
V_{\psi,p}f &= (f\circ\psi)'\at{\psi^{-1}(p)} \hint{definition} \\
&= (f\circ(x^{-1}\circ x)\circ\psi)'\at{\psi^{-1}(p)} \hint{identity} \\
&= (x\circ\psi)'\at{\psi^{-1}(p)}(f\circ x^{-1})'\at{x\circ\psi\circ\psi^{-1}(p)} \hint{chain rule} \\
&=(x\circ\psi)'\at{\psi^{-1}(p)}(f\circ x^{-1})'\at{x(p)} \hint{identity} \\
&=(x\circ\gamma + x\circ\sigma - x(p))'\at{\psi^{-1}(p)}(f\circ x^{-1})'\at{x(p)} \hint{definition of $\psi$} \\
&=(x\circ\gamma + x\circ\sigma)'\at{\psi^{-1}(p)}(f\circ x^{-1})'\at{x(p)} \hint{derivative of constant} \\
&= (x\circ\gamma)'\at{\psi^{-1}(p)}(f\circ x^{-1})'\at{x(p)} + (x\circ\sigma)'\at{\psi^{-1}(p)}(f\circ x^{-1})'\at{x(p)} \hint{linearity of derivative} \\
&= (x\circ\gamma)'\at{\gamma^{-1}(p)}(f\circ x^{-1})'\at{x(p)} + (x\circ\sigma)'\at{\sigma^{-1}(p)}(f\circ x^{-1})'\at{x(p)} \hint{$\psi^{-1}(p) = \gamma^{-1}(p) = \sigma^{-1}(p)$} \\
&= (f\circ(x^{-1}\circ x)\circ\gamma)'\at{\gamma^{-1}(p)} + (f\circ(x^{-1}\circ x)\circ\sigma)'\at{\sigma^{-1}(p)} \hint{chain rule} \\
&= (f\circ\gamma)'\at{\gamma^{-1}(p)} + (f\circ\sigma)'\at{\sigma^{-1}(p)} \hint{identity} \\
&= V_{\gamma,p}f + V_{\sigma,p}f. \hint{definition}
\end{align}
$$

It follows that $V_{\psi,p}=V_{\gamma,p} + V_{\sigma,p}$. Notice how the chart maps completely canceled out, and so even though our definition of $\psi$ was chart-dependent, its directional derivative was not! We have thus shown that $T_p M$ is closed under vector addition.
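Here is a minimal numerical sketch of this construction. The setup is assumed purely for illustration: $M$ is the right half-plane in $\R^2$, the chart is polar coordinates, and `gamma`, `sigma` and `f` are hypothetical choices with both curves passing through $p$ at $t=0$. Derivatives are approximated by central differences.

```python
import math

def directional_derivative(f, curve, t0, h=1e-5):
    """Central-difference estimate of (f o curve)'(t0)."""
    return (f(curve(t0 + h)) - f(curve(t0 - h))) / (2 * h)

# Assumed chart: polar coordinates on the right half-plane {u > 0}.
def chart(point):
    u, v = point
    return (math.hypot(u, v), math.atan2(v, u))

def chart_inverse(coords):
    r, theta = coords
    return (r * math.cos(theta), r * math.sin(theta))

p = (1.0, 0.5)

# Two illustrative curves, both passing through p at t = 0.
def gamma(t):
    return (1.0 + t, 0.5 + t * t)

def sigma(t):
    return (1.0 - 2.0 * t, 0.5 + math.sin(t))

# psi = x^{-1}(x o gamma + x o sigma - x(p)), defined near t = 0.
def psi(t):
    xg, xs, xp = chart(gamma(t)), chart(sigma(t)), chart(p)
    return chart_inverse(tuple(g + s - c for g, s, c in zip(xg, xs, xp)))

def f(point):
    u, v = point
    return u * u * v + math.sin(v)

lhs = directional_derivative(f, psi, 0.0)
rhs = (directional_derivative(f, gamma, 0.0)
       + directional_derivative(f, sigma, 0.0))
print(lhs, rhs)
```

The two printed numbers should agree up to finite-difference error, even though `psi` was built using a particular chart.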

The proof that it is closed under scalar multiplication is very similar and slightly simpler. Let $k\in\R$ be a scalar, suppose again that $(U, x)$ is a chart whose domain contains $p$, and consider the chart representative $x\circ\gamma:\R\to\R^n$. Scalar multiplication makes sense in our chart, but if we try to construct a curve whose chart representative is just $k(x\circ\gamma)$ we hit a similar problem to the above, where the image of this curve's chart representative will not pass through $x(p)$ unless $k=1$ or $x(p)=0$. We will instead define a new curve $\sigma$ whose chart representative is

$$x\circ\sigma = k(x\circ\gamma) - (k-1)x(p).$$

Note that this guarantees that the image of $x\circ\sigma$ passes through $x(p)$. To construct the curve $\sigma$ from its chart representative, we simply apply $x^{-1}$ as we did before.

$$\sigma = x^{-1}\big(k(x\circ\gamma)-(k-1)x(p)\big).$$

We must demonstrate that $V_{\sigma, p} = kV_{\gamma, p}$. Technically, we must also show that this directional derivative is independent of our chosen chart, but this will end up becoming irrelevant just as it did above. Choose any smooth function $f:M\to\R$. Then

$$
\begin{align}
V_{\sigma,p}f &= (f\circ\sigma)'\at{\sigma^{-1}(p)} \hint{definition} \\
&= (f\circ(x^{-1}\circ x)\circ\sigma)'\at{\sigma^{-1}(p)} \hint{identity} \\
&= (x\circ\sigma)'\at{\sigma^{-1}(p)}(f\circ x^{-1})'\at{x\circ\sigma\circ\sigma^{-1}(p)} \hint{chain rule} \\
&=(x\circ\sigma)'\at{\sigma^{-1}(p)}(f\circ x^{-1})'\at{x(p)} \hint{identity} \\
&=\big(k(x\circ\gamma) - (k-1)x(p)\big)'\at{\sigma^{-1}(p)}(f\circ x^{-1})'\at{x(p)} \hint{definition of $\sigma$} \\
&=\big(k(x\circ\gamma)\big)'\at{\sigma^{-1}(p)}(f\circ x^{-1})'\at{x(p)} \hint{derivative of constant} \\
&= k(x\circ\gamma)'\at{\sigma^{-1}(p)}(f\circ x^{-1})'\at{x(p)}\hint{linearity of derivative} \\
&= k(x\circ\gamma)'\at{\gamma^{-1}(p)}(f\circ x^{-1})'\at{x(p)}\hint{$\sigma^{-1}(p)=\gamma^{-1}(p)$} \\
&= k(x\circ\gamma)'\at{\gamma^{-1}(p)}(f\circ x^{-1})'\at{x\circ\gamma\circ\gamma^{-1}(p)} \hint{identity} \\
&= k(f\circ\gamma)'\at{\gamma^{-1}(p)} \hint{chain rule} \\
&= kV_{\gamma,p}f.
\end{align}$$

It follows that $V_{\sigma,p}=kV_{\gamma,p}$, and thus $T_p M$ is closed under scalar multiplication, completing the proof.
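For a concrete instance, here is a small worked example with illustrative choices of my own. Take $M=\R^2$ with the identity chart, $p=(1,0)$, $\gamma(t)=(1+t,t^2)$ and $k=3$. The recipe above gives

$$\sigma(t) = 3(1+t,\,t^2) - 2(1,0) = (1+3t,\,3t^2),$$

which passes through $p$ at $t=0$, just as $\gamma$ does. For the function $f(u,v)=u^2+v$ we get $(f\circ\gamma)(t)=(1+t)^2+t^2$ and $(f\circ\sigma)(t)=(1+3t)^2+3t^2$, so

$$V_{\gamma,p}f = (f\circ\gamma)'\at{0} = 2 \qquad\text{and}\qquad V_{\sigma,p}f = (f\circ\sigma)'\at{0} = 6 = 3\,V_{\gamma,p}f.$$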

Now that we are confident that $T_p M$ is a vector space, we will normally refer to its elements as tangent vectors. Interestingly, it is usually the case that the curve along which the tangent vector is defined is not important. Let's explore why. Suppose $\gamma$ is a smooth curve, $f$ is a smooth function, $p\in M$ is a point on our curve, and $(U,x)$ is a chart whose domain contains $p$. Then as before,

$$\begin{align}
V_{\gamma, p}f &= (f\circ\gamma)'\at{\gamma^{-1}(p)} \\
&= (f\circ (x^{-1}\circ x)\circ\gamma)'\at{\gamma^{-1}(p)} \\
&= ((f\circ x^{-1})\circ (x\circ\gamma))'\at{\gamma^{-1}(p)}.
\end{align}$$

To make our lives easier, let's make the temporary substitutions
$$\begin{align}
F &= f\circ x^{-1}, \\
\Gamma &= x\circ\gamma.
\end{align}$$

Now $F:\R^n\to\R$ and $\Gamma:\R\to\R^n$, so we can use the multivariable chain rule to obtain

$$\begin{align}
V_{\gamma, p}f &= (F\circ\Gamma)'\at{\gamma^{-1}(p)} \\
&= \sum_{i=1}^n\Bigg(\partial_i\at{x(p)}F\Bigg)\Bigg((\Gamma^{i})'\at{\gamma^{-1}(p)}\Bigg) \\
&= \sum_{i=1}^n\Bigg(\partial_i\at{x(p)}(f\circ x^{-1})\Bigg)\Bigg((x^i\circ\gamma)'\at{\gamma^{-1}(p)}\Bigg),
\end{align}$$

where $\partial_i$ represents the regular partial derivative with respect to the $i$th component and $\Gamma^i$ and $x^i$ are the $i$th component functions of $\Gamma$ and $x$, respectively.

Now, this certainly doesn't look any nicer than $V_{\gamma, p}f$, so why have we done this? Because we have managed to separate the definition into two chunks, one of which depends on the curve $\gamma$ and one which doesn't. The part which doesn't, $\partial_i\at{x(p)}(f\circ x^{-1})$, is so special that it deserves its own name.

Definition. Let $M$ be a smooth manifold and $(U,x)$ a chart whose domain contains $p\in M$. Then the $i$th partial derivative operator with respect to the chart map $x$ is the map $$\frac{\partial}{\partial x^i}\at{p}:C^\infty(M)\to\R$$ defined by $$\frac{\partial}{\partial x^i}\at{p}f=\partial_i(f\circ x^{-1})\at{x(p)}$$ for any smooth function $f\in C^\infty(M)$.

With this new definition, we see that

$$V_{\gamma, p}f = \sum_{i=1}^n V^i \frac{\partial}{\partial x^i}\at{p}f$$

where the $V^i\in\R$ are the "curve-dependent" parts which we don't care about, and we can think of them as arbitrary scalars. This means that the partial derivative operators $\Big(\dfrac{\partial}{\partial x^i}\at{p}\Big)_{i=1}^n$ span $T_p M$. Furthermore, it is easy to show that they are linearly independent. It follows that $\Big(\dfrac{\partial}{\partial x^i}\at{p}\Big)_{i=1}^n$ is a basis for the tangent space $T_p M$. From this, we see that $\dim T_p M = \dim M$. That is, the dimension of the tangent space is the same as the dimension of the original manifold!
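The expansion of $V_{\gamma,p}f$ in terms of chart partial derivatives can be checked numerically. The following is only a sketch under assumed data: polar coordinates on the right half-plane play the role of the chart $x$, and the curve `gamma` and function `f` are hypothetical. The components $V^i=(x^i\circ\gamma)'$ and the chart partials $\partial_i(f\circ x^{-1})$ are both approximated by central differences.

```python
import math

def derivative(func, t0, h=1e-5):
    """Central-difference estimate of func'(t0) for func: R -> R."""
    return (func(t0 + h) - func(t0 - h)) / (2 * h)

# Assumed chart: polar coordinates on the right half-plane {u > 0}.
def chart(point):
    u, v = point
    return (math.hypot(u, v), math.atan2(v, u))

def chart_inverse(coords):
    r, theta = coords
    return (r * math.cos(theta), r * math.sin(theta))

p = (1.0, 0.5)
t0 = 0.0  # parameter value at which gamma passes through p

def gamma(t):
    return (1.0 + t, 0.5 + t * t)

def f(point):
    u, v = point
    return u * u * v + math.sin(v)

xp = chart(p)

# Left-hand side: V_{gamma,p} f = (f o gamma)'(t0).
lhs = derivative(lambda t: f(gamma(t)), t0)

# Curve-dependent components V^i = (x^i o gamma)'(t0).
components = [derivative(lambda t, i=i: chart(gamma(t))[i], t0)
              for i in range(2)]

# Chart partials: partial_i (f o x^{-1}) evaluated at x(p).
def chart_partial(i):
    def along_axis(s):
        coords = list(xp)
        coords[i] = s  # vary only the i-th coordinate
        return f(chart_inverse(coords))
    return derivative(along_axis, xp[i])

rhs = sum(components[i] * chart_partial(i) for i in range(2))
print(lhs, rhs)
```

Both numbers should agree up to finite-difference error. Only `components` depends on the curve, while `chart_partial` depends only on the chart, the point and the function.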

The pushforward gives us a way to take smooth maps between manifolds and translate them into maps between tangent spaces on those manifolds. Here's the definition:

Definition. Let $M$ and $N$ be smooth manifolds with $p\in M$, and let $\phi:M\to N$ be a smooth map. The pushforward of $\phi$ is the map

$$\phi_*:T_p M\to T_{\phi(p)} N$$

defined by

$$(\phi_*X)f=X(f\circ\phi)$$

for any tangent vector $X\in T_p M$ and any smooth function $f\in C^\infty(N)$.

Let's try to make sense of this definition. The pushforward turns the smooth map $\phi$ into a map $\phi_*$ between $T_p M$ and $T_{\phi(p)} N$, so it takes a tangent vector in $M$ based at the point $p$ to a tangent vector in $N$ based at the point $\phi(p)$. But tangent vectors are defined by how they act on smooth functions, so we defined the pushforward by how the tangent vector $\phi_*X\in T_{\phi(p)} N$ acts on smooth functions $f\in C^\infty(N)$.
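Here is a small worked example, with illustrative choices that are mine rather than part of the definition. Take $M=\R^2$ and $N=\R$, both with their identity charts, and let $\phi(u,v)=u^2+v^2$, $p=(1,2)$ and $X=\dfrac{\partial}{\partial u}\at{p}$. Note that $\phi(p)=5$. For any smooth function $f\in C^\infty(\R)$, the definition gives

$$\begin{align}
(\phi_*X)f &= X(f\circ\phi) \\
&= \partial_u\big(f(u^2+v^2)\big)\at{(1,2)} \\
&= 2u\,f'(u^2+v^2)\at{(1,2)} \\
&= 2f'(5).
\end{align}$$

So in this example $\phi_*X$ is twice the ordinary derivative at the point $\phi(p)=5$, which is indeed a tangent vector in $T_{\phi(p)}N$.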

Since the pushforward is a map between vector spaces, it makes sense to ask whether it is a linear map. The answer is yes!

Theorem. The pushforward is a linear map.

Proof. Let $M$ and $N$ be smooth manifolds with $p\in M$, and let $\phi:M\to N$ be a smooth map. We argue first that the pushforward $\phi_*$ is additive. So suppose $X$ and $Y$ are tangent vectors in $T_pM$. Then

$$\begin{align}
\phi_*(X+Y)f &= (X+Y)(f\circ\phi) \\
&= X(f\circ\phi) + Y(f\circ\phi) \\
&= (\phi_*X)f + (\phi_*Y)f
\end{align}$$

for any smooth function $f:N\to\R$. Next, we argue that $\phi_*$ is homogeneous. So suppose $X$ is a tangent vector in $T_pM$ and $a\in\R$. Then

$$\begin{align}
\phi_*(aX)f &= (aX)(f\circ\phi) \\
&= aX(f\circ\phi)\\
&= a(\phi_*X)f.
\end{align}$$

Not only is the pushforward a linear map, it also behaves exactly as we would like it to on tangent vectors. That is, if $V_{\gamma, p}$ is a directional derivative operator (tangent vector) along a curve $\gamma$ then the pushforward $\phi_*$ takes this to the directional derivative operator along the curve $\phi\circ\gamma$ in $N$.

The following theorem verifies that this is actually what happens.

Theorem. Let $M$ and $N$ be smooth manifolds with $p\in M$, let $\gamma:\R\to M$ be a smooth curve through $p$, and let $\phi:M\to N$ be a smooth map. Then

$$\phi_* V_{\gamma,p} = V_{\phi\circ\gamma, \phi(p)}.$$

Proof. The result follows from the following computation:

$$\begin{align}
(\phi_* V_{\gamma,p})f &= V_{\gamma,p}(f\circ\phi) \\
&= \big((f\circ\phi)\circ\gamma\big)'\at{\gamma^{-1}(p)} \\
&= \big(f\circ(\phi\circ\gamma)\big)'\at{\gamma^{-1}(p)} \\
&= \big(f\circ(\phi\circ\gamma)\big)'\at{(\phi\circ\gamma)^{-1}(\phi(p))} \hint{$(\phi\circ\gamma)(\gamma^{-1}(p)) = \phi(p)$} \\
&= V_{\phi\circ\gamma, \phi(p)}f.
\end{align}$$

This post is only half of the picture. Next time I'll talk about the cotangent space and the pullback!