More than Infinitesimal: What is “dx”?

November 3, 2014. 11 comments

Problem

Many people have asked this question, and many will continue to do so. It is the natural question of someone first learning the subject of calculus: what is “\(\mathrm{d}x\)”, and why is it everywhere in calculus?

Frankly, it’s mostly Leibniz’s fault. Leibniz, a brilliant philosopher and mathematician who may or may not have invented calculus depending on whom you ask, introduced the notation. He viewed derivatives as ratios of related infinitesimals. In slightly more modern terms, $$\lim_{\Delta x\rightarrow 0}\frac{\Delta y}{\Delta x}=\frac{dy}{dx}.$$ Unfortunately, people have carried this view for an unhealthily long time. A branch of analysis known as “nonstandard analysis” found a clever and eponymously nonstandard way to make the idea of the infinitesimal rigorous. Still others have questioned the validity of the law of the excluded middle, which says that a proposition must be either true or false; rejecting this law, with some finagling and rather nonstandard logic, also brings about the idea of an infinitesimal. The list goes on. In short, there are myriad “nonstandard” ways to realize \(\mathrm{d}x\).

This brings us to a much more refined question: is there a standard way to define \(\mathrm{d}x\)?

We will defend the claim that the answer is a resounding “yes.” What’s more, we will attempt to demonstrate that the concept is intuitive and natural, even to those relatively new to the subject of analysis.


Intuition-based Introduction

In high school physics and precalculus, we are told that a vector is a magnitude accompanied by a direction. Although this is wonderful for the purposes of introduction, we aren’t fools. Position is not mentioned anywhere within this description of a vector. Thus, conceivably, we could push a large crate over a frictionless surface and have the force vector end up around the Mars rover. We need position to realistically use vectors.

With the additional datum of position, vectors become tangent vectors. If \(v\) is a vector and \(p\) is a point, then we will write \(v_p\) for the tangent vector corresponding to \(v\) at the point \(p\). The reason they are called tangent vectors comes from their use in differential geometry, where a vast, powerful generalization of lines, planes, and other “Euclidean” spaces, called a “(smooth) manifold,” is used. A circle is an example of such a space, and the tangent line at a point on the circle can be thought of as the set of all tangent vectors to the circle at that point. For the sake of intuition, we will stay within Euclidean space (\(\mathbb{R}^n\) for some nonnegative integer \(n\)) for this section.

Suppose \(f:\mathbb{R}^2\to\mathbb{R}\) is a smooth function from the real plane to the real line. At each point \(p\in\mathbb{R}^2\), we can take the directional derivative of \(f\) in whatever direction we like. For example, if we have coordinates \((x,y)\), then \(\left.\frac{\partial f}{\partial x}\right|_p\) is the directional derivative at \(p\) of \(f\) in the direction of increasing \(x\). If we wanted to take the directional derivative at \(p\) of \(f\) in the direction of decreasing \(x\), then we would just negate to obtain  \(-\left.\frac{\partial f}{\partial x}\right|_p\).

In fact, if we pick any tangent vector at \(p\), it will correspond to a directional derivative at \(p\) in the direction of that vector, weighted by its magnitude. Following the above example, taking the directional derivative at \(p\) in the direction of increasing \(x\) is the same as applying \(\left.\frac{\partial}{\partial x}\right|_p\), and going the opposite direction is the same as applying \(-\left.\frac{\partial}{\partial x}\right|_p\). Taking this a step further, we claim that tangent vectors and directional derivatives are the same thing! This may seem shocking, but all it says is that every tangent vector corresponds to a directional derivative with the same magnitude and direction. If we are working in the plane, then we can therefore identify \(\left.\frac{\partial}{\partial x}\right|_p\) with \(\begin{bmatrix}1 & 0\end{bmatrix}^T_p\) and \(\left.\frac{\partial}{\partial y}\right|_p\) with \(\begin{bmatrix} 0 & 1\end{bmatrix}^T_p\), where \(v^T\) denotes the transpose of a vector \(v\).
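
To make the identification concrete (the particular function and point below are chosen purely for illustration), take the tangent vector \(v_p=\begin{bmatrix}2 & 3\end{bmatrix}^T_p\) at \(p=(1,2)\) and the function \(f(x,y)=x^2y\). The corresponding directional derivative is \(2\left.\frac{\partial}{\partial x}\right|_p+3\left.\frac{\partial}{\partial y}\right|_p\), and $$v_p(f)=2\left.\frac{\partial f}{\partial x}\right|_{(1,2)}+3\left.\frac{\partial f}{\partial y}\right|_{(1,2)}=2\cdot(2xy)\big|_{(1,2)}+3\cdot(x^2)\big|_{(1,2)}=8+3=11,$$ the rate of change of \(f\) at \(p\) along \(v\), weighted by the length of \(v\).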

We use the transpose so that tangent vectors are column vectors. By doing this, we can let row vectors act on them by left multiplication. Note that left multiplying a column vector by a row vector is the same as taking the dot product of the two vectors. Thus, row vectors can be viewed as linear functions from column vectors to the real numbers. Attaching these linear functionals to a point as we did with tangent vectors, we obtain cotangent vectors, or covectors. Hinting at where this conversation is headed, in the plane we will denote the covector \(\alpha_p\) that satisfies \(\alpha_p\left(\left.\frac{\partial}{\partial x}\right|_p\right)=1\) and \(\alpha_p\left(\left.\frac{\partial}{\partial y}\right|_p\right)=0\) by \(\mathrm{d}x_p\).
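
In matrix terms (written out here just to make the action explicit), \(\mathrm{d}x_p\) is the row vector \(\begin{bmatrix}1 & 0\end{bmatrix}_p\): acting on an arbitrary tangent vector, it reads off the \(x\)-component, $$\mathrm{d}x_p\left(\begin{bmatrix}a \\ b\end{bmatrix}_p\right)=\begin{bmatrix}1 & 0\end{bmatrix}\begin{bmatrix}a \\ b\end{bmatrix}=a,$$ and the analogous covector \(\mathrm{d}y_p=\begin{bmatrix}0 & 1\end{bmatrix}_p\) reads off the \(y\)-component.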

The concept of a vector field from multivariate calculus is best seen in the context of tangent vectors. Using our terminology, a vector field is simply a function that takes in a point in \(\mathbb{R}^n\) and outputs a tangent vector at that point. Similarly, we define a differential \(1\)-form as a function that takes in a point in \(\mathbb{R}^n\) and outputs a covector at that point. For example, the map $$\mathrm{d}x:p\mapsto\mathrm{d}x_p$$ is a differential \(1\)-form. Thus, \(\mathrm{d}x\) is simply the differential \(1\)-form that takes each point \(p\) of the space to the cotangent vector \(\mathrm{d}x_p\) at \(p\) that satisfies \(\mathrm{d}x_p\left(\left.\frac{\partial}{\partial x}\right|_p\right)=1\) and \(\mathrm{d}x_p(v_p)=0\) whenever the \(\left.\frac{\partial}{\partial x}\right|_p\)-component of \(v_p\) is zero; that is, \(\mathrm{d}x_p\left(\left.\frac{\partial}{\partial x^i}\right|_p\right)=0\) for every other coordinate direction \(x^i\).

How could these possibly be interpreted as infinitesimals? I think Spivak explains this best:

“Classical differential geometers (and classical analysts) did not hesitate to talk about ‘infinitely small’ changes \(\mathrm{d}x^i\) of the coordinates \(x^i\), just as Leibnitz (sic) had. No one wanted to admit that this was nonsense, because true results were obtained when these infinitely small quantities were divided into each other (provided one did it in the right way).

“Eventually it was realized that the closest one can come to describing an infinitely small change is to describe a direction in which this change is supposed to occur, i.e., a tangent vector. Since \(\mathrm{d}f\) is supposed to be an infinitesimal change of \(f\) under an infinitesimal change of the point, \(\mathrm{d}f\) must be a function of this change, which means that \(\mathrm{d}f\) should be a function on tangent vectors. The \(\mathrm{d}x^i\) themselves then metamorphosed into functions, and it became clear that they must be distinguished from the tangent vectors \(\partial/\partial x^i\).” [1]


Interlude: Differential k-forms

The term differential \(1\)-form ominously implies it is part of a larger structure. In this section, we will provide blatantly formal definitions, then provide intuition. The formal subsection is designed to show that differential geometry is fundamentally different from your calculus homework (i.e. Please stop using the (differential-geometry) tag for questions that don’t apply to that level of generality). The formal subsection should, therefore, be skipped at the first hint of confusion.

Formal

Let \(M\) be a smooth manifold. We shall denote the space of smooth sections of a vector bundle \(\xi\) by \(\Gamma(\xi)\). Define \(\Gamma(\Lambda\,T^\vee M)\) to be the quotient algebra of the free algebra on \(\Gamma(T^\vee M)\) by the ideal generated by elements of the form \(\alpha\otimes\alpha\). In other words, $$\Gamma(\Lambda\,T^\vee M)=\left(\bigoplus_{k=0}^\infty(\Gamma(T^\vee M))^{\otimes k}\right)/\left<\alpha\otimes\alpha\vert \alpha\in\Gamma(T^\vee M)\right>.$$ We shall denote the product in \(\Gamma(\Lambda\,T^\vee M)\) by \(\wedge\).

We may now define differential forms as elements of \(\Gamma(\Lambda\,T^\vee M)\), and a differential \(k\)-form is simply an element of the subspace $$\Gamma(\Lambda^k\,T^\vee M)=(\Gamma(T^\vee M))^{\otimes k}/(\left<\alpha\otimes\alpha\vert \alpha\in\Gamma(T^\vee M)\right>\cap (\Gamma(T^\vee M))^{\otimes k}).$$ It should be noted that \(\Gamma(\Lambda^0\,T^\vee M)=C^\infty(M)\).

We define the exterior derivative \(\mathrm{d}\) as the unique \(\mathbb{R}\)-linear map \(\mathrm{d}:\Gamma(\Lambda\,T^\vee M)\to\Gamma(\Lambda\,T^\vee M)\) such that

  • If \(\alpha\) is a \(k\)-form, then \(\mathrm{d}\alpha\) is a \((k+1)\)-form.
  • If \(f\in C^\infty(M)\), then \(\mathrm{d}f_p(X_p)=X_p(f)\) for all tangent vectors \(X_p\in T_pM\).
  • If \(\alpha\) is a \(k\)-form and \(\beta\) is another differential form, then \(\mathrm{d}(\alpha\wedge\beta)=\mathrm{d}\alpha\wedge\beta+(-1)^k\alpha\wedge\mathrm{d}\beta\).
  • For all differential forms \(\alpha\), \(\mathrm{d}(\mathrm{d}\alpha)=0\).
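
As a quick illustration of how much these properties determine, the second one already pins down \(\mathrm{d}\) on functions in local coordinates \((x^1,\,\dots\,,x^n)\): since \(\mathrm{d}f_p\left(\left.\frac{\partial}{\partial x^i}\right|_p\right)=\left.\frac{\partial f}{\partial x^i}\right|_p\), we must have $$\mathrm{d}f=\sum_{i=1}^n\frac{\partial f}{\partial x^i}\,\mathrm{d}x^i,$$ which in one variable is the familiar \(\mathrm{d}f=f'\,\mathrm{d}x\).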

Informal

Let’s take this step by step, and stay in \(\mathbb{R}^n\). We introduce a “wedge product” \(\wedge\) to extend the space of differential \(1\)-forms. A \(0\)-form is just a smooth function, and for \(f\in C^\infty(\mathbb{R}^n)\) and a differential form \(\alpha\), \(f\wedge\alpha=f\alpha\). If \(\alpha\) and \(\beta\) are distinct \(1\)-forms, then \(\alpha\wedge\beta\) is a \(2\)-form. If \(\gamma\) is yet another distinct \(1\)-form, then \(\alpha\wedge\beta\wedge\gamma\) is a \(3\)-form. In general, if \(\{\alpha^1,\,\dots\,,\alpha^k\}\) is a set of \(k\) distinct \(1\)-forms, then \(\alpha^1\wedge\alpha^2\wedge\cdots\wedge\alpha^k\) is a \(k\)-form. However, \(0\wedge\alpha=0\) and \(\alpha\wedge\alpha=0\) for all \(1\)-forms \(\alpha\); expanding \((\alpha+\beta)\wedge(\alpha+\beta)=0\) shows that \(\alpha\wedge\beta=-\beta\wedge\alpha\), so the wedge product of \(1\)-forms is anticommutative.
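
For a concrete computation in the plane (the coefficient functions \(f,g,h,k\) are arbitrary smooth functions, named only for this example), these rules give $$(f\,\mathrm{d}x+g\,\mathrm{d}y)\wedge(h\,\mathrm{d}x+k\,\mathrm{d}y)=fk\,\mathrm{d}x\wedge\mathrm{d}y+gh\,\mathrm{d}y\wedge\mathrm{d}x=(fk-gh)\,\mathrm{d}x\wedge\mathrm{d}y,$$ since the \(\mathrm{d}x\wedge\mathrm{d}x\) and \(\mathrm{d}y\wedge\mathrm{d}y\) terms vanish.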

Differential \(k\)-forms generalize \(1\)-forms by taking in \(k\) tangent vectors instead of only \(1\). For example, in the plane with coordinates \((x,y)\), \((\mathrm{d}x\wedge\mathrm{d}y)_p(v_p,w_p)=\mathrm{d}x_p(v_p)\mathrm{d}y_p(w_p)-\mathrm{d}x_p(w_p)\mathrm{d}y_p(v_p)\).
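
Plugging in specific tangent vectors (chosen only for illustration), say \(v_p=\begin{bmatrix}1 & 2\end{bmatrix}^T_p\) and \(w_p=\begin{bmatrix}3 & 4\end{bmatrix}^T_p\), gives $$(\mathrm{d}x\wedge\mathrm{d}y)_p(v_p,w_p)=1\cdot 4-3\cdot 2=-2,$$ the signed area of the parallelogram spanned by \(v_p\) and \(w_p\) (negative here because the pair \((v_p,w_p)\) is clockwise).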

In addition to this, there is a linear map \(\mathrm{d}\) that takes \(k\)-forms to \((k+1)\)-forms. If the coordinates are given by \((x^1,\,\dots\,,x^n)\) and $$\alpha=\sum_{1\leq i_1<\cdots<i_k\leq n}f_{i_1\cdots i_k}\mathrm{d}x^{i_1}\wedge\cdots\wedge\mathrm{d}x^{i_k}=\sum_If_I\mathrm{d}x^I$$ is a \(k\)-form, then $$\mathrm{d}\alpha=\sum_{j=1}^n\sum_I\frac{\partial f_I}{\partial x^j}\mathrm{d}x^j\wedge\mathrm{d}x^I.$$
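
For instance, on \(\mathbb{R}^2\) with coordinates \((x,y)\), the \(1\)-form \(\alpha=xy\,\mathrm{d}x+x^2\,\mathrm{d}y\) (an arbitrary example) has $$\mathrm{d}\alpha=\frac{\partial(xy)}{\partial y}\,\mathrm{d}y\wedge\mathrm{d}x+\frac{\partial(x^2)}{\partial x}\,\mathrm{d}x\wedge\mathrm{d}y=-x\,\mathrm{d}x\wedge\mathrm{d}y+2x\,\mathrm{d}x\wedge\mathrm{d}y=x\,\mathrm{d}x\wedge\mathrm{d}y,$$ the \(\mathrm{d}x\wedge\mathrm{d}x\) and \(\mathrm{d}y\wedge\mathrm{d}y\) terms having vanished. For a \(0\)-form \(f\), the same formula together with the equality of mixed partial derivatives gives \(\mathrm{d}(\mathrm{d}f)=0\).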


Why would this be the natural choice?

We have claimed that this interpretation fits naturally with the way we already approach calculus. While a subjective claim like this cannot be proved mathematically, a few examples of how it works should display the beauty of the concept.

Indefinite Integration

It is tempting to simply write \(\int\mathrm{d}\alpha=\alpha\), but this is not well defined. Denote the space of differential \(k\)-forms on \(\mathbb{R}^n\) by \(\Omega^k(\mathbb{R}^n)\). If \(\alpha\in\Omega^{k+1}(\mathbb{R}^n)\) and \(\alpha=\mathrm{d}\beta\) for some \(\beta\in\Omega^k(\mathbb{R}^n)\), then \(\alpha=\mathrm{d}(\beta+\mathrm{d}\gamma)\) as well, for any \(\gamma\in\Omega^{k-1}(\mathbb{R}^n)\), since \(\mathrm{d}(\mathrm{d}\gamma)=0\). Thus, there is no unique \(k\)-form mapping to \(\alpha\).

Define a \(k\)-form \(\alpha\) to be exact if and only if \(\alpha=\mathrm{d}\beta\) for some \(\beta\), and say \(\alpha\in\mathrm{d}(\Omega^{k-1}(\mathbb{R}^n))\). Similarly, define a \(k\)-form \(\alpha\) to be closed if and only if \(\mathrm{d}\alpha=0\), and say \(\alpha\in\Omega^k_\text{Closed}(\mathbb{R}^n)\). Using these, we may now define $$\int:\mathrm{d}(\Omega^{k}(\mathbb{R}^n))\to\Omega^k(\mathbb{R}^n)/\Omega^{k}_\text{Closed}(\mathbb{R}^n),$$

$$\mathrm{d}\alpha\mapsto\alpha+\Omega^k_\text{Closed}(\mathbb{R}^n).$$ This is well defined: if \(\mathrm{d}\alpha=\mathrm{d}\alpha'\), then \(\alpha-\alpha'\) is closed, so \(\alpha\) and \(\alpha'\) determine the same coset. Thus, \(\int\) maps the exact form \(\mathrm{d}\alpha\) back to the coset of \(\Omega^k_\text{Closed}(\mathbb{R}^n)\) determined by \(\alpha\). In less formal terms, this means that \(\int\) takes the exact form \(\mathrm{d}\alpha\) to the collection \(\left\{\alpha + \sigma : \mathrm{d}\sigma=0\right\}\). (For you sheafites, this can easily be seen as a map between presheaves.)

Does this work with the indefinite integral we already use? Let \(k=0\) and \(n=1\). Note that a \(0\)-form \(g\) on \(\mathbb{R}\) is closed exactly when \(g'=0\), i.e., when \(g\) is constant. For \(\mathrm{d}f=f'\,\mathrm{d}x\), $$\int\mathrm{d}f=\int f'\,\mathrm{d}x=f+\Omega^0_\text{Closed}(\mathbb{R})\ni f+c,$$ where \(c\) is an “arbitrary constant of integration.” Thus, the indefinite integral arises naturally from this choice of perspective.
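
To put numbers on it (an illustrative choice of \(f\)), take \(f(x)=x^2\). Then \(\mathrm{d}f=2x\,\mathrm{d}x\), and $$\int 2x\,\mathrm{d}x=x^2+\Omega^0_\text{Closed}(\mathbb{R})=\left\{x^2+c : c\in\mathbb{R}\right\},$$ which is exactly the “\(x^2+c\)” of introductory calculus.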

Differentiation

With this interpretation, we can define differentiation as simply the application of a tangent vector or vector field to a smooth function. Though this looks rather snazzy at first glance, it really just says that differentiation is defined as usual; we have only changed the context. For the sake of entertainment, we shall introduce a generalization.

Suppose \(\phi\) is a smooth bijection with a smooth inverse, \(\psi\) is a smooth map, and \(f\) is a smooth, real-valued function. We define the pushforward \(\psi_\ast X_p\) of a tangent vector \(X_p\) at \(p\) (a tangent vector at \(\psi(p)\)) by \((\psi_\ast X_p)(f)=X_p(f\circ\psi)\), and the pushforward \(\phi_\ast X\) of a vector field \(X\) by \(\phi\) by $$(\phi_\ast X)_p(f)=X_{\phi^{-1}(p)}(f\circ\phi).$$ For a differential \(k\)-form \(\alpha\), we define the pullback \(\psi^\ast\alpha\) of \(\alpha\) by \(\psi\) by $$(\psi^\ast \alpha)_p(X_1(p),\,\dots\,,X_k(p))=\alpha_{\psi(p)}(\psi_\ast(X_1(p)),\,\dots\,,\psi_\ast(X_k(p))).$$
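
For a concrete pullback (the particular map is only an illustration), let \(\psi:\mathbb{R}\to\mathbb{R}^2,~t\mapsto(\cos t,\sin t)\). For a tangent vector \(a\left.\frac{\partial}{\partial t}\right|_t\), $$(\psi^\ast\mathrm{d}x)_t\left(a\left.\frac{\partial}{\partial t}\right|_t\right)=\mathrm{d}x_{\psi(t)}\left(\psi_\ast\left(a\left.\frac{\partial}{\partial t}\right|_t\right)\right)=a\,\frac{\mathrm{d}}{\mathrm{d}t}\cos t=-a\sin t,$$ so \(\psi^\ast\mathrm{d}x=-\sin t\,\mathrm{d}t\). This is precisely the substitution \(x=\cos t\), \(\mathrm{d}x=-\sin t\,\mathrm{d}t\) from a first course in calculus, now as a statement about forms.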

If \(X\) is a vector field defined on \(\mathbb{R}^n\) and \(p\in\mathbb{R}^n\), then there exist an open neighborhood \(U\subseteq\mathbb{R}^n\) of \(p\) and an \(\varepsilon>0\) such that there is a unique map \(\phi^X:(-\varepsilon,\varepsilon)\times U\to\mathbb{R}^n,~(t,q)\mapsto\phi^X_t(q)\) satisfying the following:

  • \(\phi^X\) is smooth.
  • For fixed \(t\), \(\phi^X_t:U\to\phi^X_t(U)\) is a smooth bijection with a smooth inverse.
  • If \(|a|<\varepsilon\), \(|b|<\varepsilon\), and \(|a+b|<\varepsilon\), then \(\phi^X_{a+b}=\phi^X_a\circ\phi^X_b\).
  • Let \(\psi_q:t\mapsto\phi^X_t(q)\). For all \(q\in U\), \((\psi_q)_\ast\!\left(\left.\frac{\partial}{\partial t}\right|_0\right)=X_q\).

We call \(\phi^X\) the local flow of \(X\).
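
For example (a standard illustrative choice of vector field), take \(X=x\,\frac{\partial}{\partial x}\) on \(\mathbb{R}\). Its flow is \(\phi^X_t(x)=e^tx\): each condition above is easy to check, and for \(\psi_q(t)=e^tq\) we have $$(\psi_q)_\ast\!\left(\left.\frac{\partial}{\partial t}\right|_0\right)(f)=\left.\frac{\mathrm{d}}{\mathrm{d}t}\right|_{t=0}f(e^tq)=q\,f'(q)=X_q(f).$$ (This particular flow happens to be defined for all \(t\); in general, only a local flow is guaranteed.)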

For any differential form \(\alpha\), we can differentiate it with respect to a vector field \(X\) using the Lie derivative: $$\mathcal{L} _ X\alpha=\lim _ {\delta\rightarrow 0}\frac{1}{\delta}\hspace{-.5em}\left((\phi^X _ \delta) ^ \ast \alpha-\alpha\right).$$

For \(0\)-forms \(f\), this is simply \(\mathcal{L} _ Xf=X(f)\). Again, this setting is the natural playground of calculus.
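
As a quick check of the definition, take \(X=\frac{\partial}{\partial x}\) on \(\mathbb{R}^2\), whose flow is \(\phi^X_\delta(x,y)=(x+\delta,y)\). For the \(2\)-form \(f\,\mathrm{d}x\wedge\mathrm{d}y\), the pullback is \((\phi^X_\delta)^\ast(f\,\mathrm{d}x\wedge\mathrm{d}y)=f(x+\delta,y)\,\mathrm{d}x\wedge\mathrm{d}y\), so $$\mathcal{L}_X(f\,\mathrm{d}x\wedge\mathrm{d}y)=\lim_{\delta\rightarrow 0}\frac{f(x+\delta,y)-f(x,y)}{\delta}\,\mathrm{d}x\wedge\mathrm{d}y=\frac{\partial f}{\partial x}\,\mathrm{d}x\wedge\mathrm{d}y:$$ the Lie derivative really does differentiate the form along the flow of \(X\).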

Definite Integration

Define a singular \(k\)-cube to be a smooth map (as a map from a manifold with boundary) \(c:[0,1]^k\to\mathbb{R}^n\). For example, a path between two points is a singular \(1\)-cube.

There is plenty we can do with these singular cubes, but we need to be able to add them. Thus, we define finite formal sums of \(k\)-cubes, called \(k\)-chains. If we need to visualize the sum of two cubes, we can think of passing through one cube, then passing through the other.

This notion of addition gives us a bit more leeway with what we can do. For example, we can define the boundary \(\partial c\) of a \(k\)-cube \(c\) by $$\sum_{i=1}^k(-1)^i(c_{(i,0)}-c_{(i,1)}),$$ where \(c_{(i,j)}:(x^1,\,\dots\,,x^{k-1})\mapsto c(x^1,\,\dots\,,x^{i-1},j,x^{i},\,\dots\,,x^{k-1})\). We can also extend this to chains, using \(\partial\hspace{-.3em}\left(\sum_r a_rc_r\right)=\sum_ra_r\partial(c_r)\). This essentially gives the oriented boundary of the singular cube, and by intuition we can see that \(\partial(\partial c)=0\) for any chain \(c\).
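
For example, for a singular \(1\)-cube \(c:[0,1]\to\mathbb{R}^n\) (a path), the formula gives $$\partial c=(-1)^1\left(c_{(1,0)}-c_{(1,1)}\right)=c(1)-c(0),$$ the endpoint minus the starting point, regarded as a formal difference of \(0\)-cubes. For the identity \(2\)-cube on \([0,1]^2\), the same formula produces the four edges of the square, with signs that traverse the boundary counterclockwise.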

For coordinates \((x^1,\,\dots\,,x^k)\) on \([0,1]^k\), a \(k\)-form \(\alpha\), a singular \(k\)-chain \(c=\sum_{r}a_rc_r\), a partition \(P=(t_{i_1},\,\dots\,, t_{i_k})\) of \([0,1]^k\), with \(0\leq t_{i_j}\leq 1\) and \(1\leq i_j\leq l_j\), and a collection \(\rho_P\) of \(\rho_{i_1\cdots i_k}\in[t_{i_1-1},t_{i_1}]\times\cdots\times[t_{i_k-1},t_{i_k}]\), we define $$S(c,\alpha,P,\rho _ P)=\sum_{r}\left(a_r\hspace{-.5em}\sum _ {\small{i _ 1,\dots, i _ k}}(c _ r ^ \ast \alpha) _ {\rho _ {i _ 1\cdots i _ k}}\hspace{-.5em}\left(\left.\frac{\partial}{\partial x^1}\right| _ {\rho _ {i _ 1\cdots i _ k}}\hspace{-1.5em},\,\dots\,,\left.\frac{\partial}{\partial x^k}\right| _ {\rho _ {i _ 1\cdots i _ k}}\right)(t _ {i _ 1}\hspace{-.5em}-t _ {i _ 1-1})\cdots(t _ {i _ k}\hspace{-.5em}-t _ {i _ k-1})\right).$$

With this, we define $$\int\limits_{c}\alpha=\lim_{\small{l_1,\dots,l_k\rightarrow\infty}}S(c,\alpha,P,\rho_P).$$

This, admittedly, looks awful. However, it already agrees with our original definitions. If \(c:[0,1]\to\mathbb{R},~t\mapsto (b-a)t+a\) and \(\alpha=\mathrm{d}f\), then we can pick \(t_i=\frac{i}{n}\), \(0\leq i\leq n\), to get $$\int\limits_c\mathrm{d}f=\lim_{n\rightarrow\infty}\sum_{i=0}^{n-1}(c^\ast\mathrm{d}f)_{t_i}\left(\left.\frac{\partial}{\partial x}\right|_{t_i}\right)(t_{i+1}-t_i)=\lim_{n\rightarrow\infty}\sum_{i=0}^{n-1}c'(t_i)f'(c(t_i))(t_{i+1}-t_i).$$ Since \(c'(t_i)=b-a\) and \(t_{i+1}-t_i=\frac{i+1}{n}-\frac{i}{n}=\frac{1}{n}\), this further simplifies to $$\int\limits_c\mathrm{d}f=\int_a^bf'\,\mathrm{d}x=\lim_{n\rightarrow\infty}\sum_{i=0}^{n-1}(b-a)f'\left(a+(b-a)\tfrac{i}{n}\right)\tfrac{1}{n}=\lim_{n\rightarrow\infty}\sum_{i=0}^{n-1}f'\left(a+i\tfrac{b-a}{n}\right)\tfrac{b-a}{n},$$ which is just a left-endpoint Riemann sum for \(\int_a^bf'\,\mathrm{d}x\).

As hideous as all of this looks, and though it is already natural in an organic way, it culminates in something euphorically beautiful.

Theorem: If \(\alpha\) is a \(k\)-form and \(c\) is a \((k+1)\)-chain, then $$\int\limits_{\partial c}\alpha=\int\limits_{c}\mathrm{d}\alpha.$$

See that theorem? It is arguably the most important theorem in all of calculus and analysis.

The fundamental theorem of calculus states that \(\int_a^bf'\,\mathrm{d}x=f(b)-f(a)\). By our new theorem, called the generalized Stokes theorem, $$\int_a^bf'\,\mathrm{d}x=\int\limits_c\mathrm{d}f=\int\limits_{\partial c}f=f(b)-f(a),$$ where \(c\) is the \(1\)-cube from before, \(\partial c=c(1)-c(0)\) is the \(0\)-chain consisting of the point \(b\) minus the point \(a\), and integrating the \(0\)-form \(f\) over a point simply evaluates it there. Similar thinking leads to easy proofs of Green’s theorem and the divergence theorem from multivariate calculus. Even the residue theorem from complex analysis has a nice proof using this theorem.
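
To see how Green’s theorem falls out (sketched here under the usual smoothness assumptions, with \(P\) and \(Q\) arbitrary smooth functions), take the \(1\)-form \(\omega=P\,\mathrm{d}x+Q\,\mathrm{d}y\) on the plane and a \(2\)-chain \(c\) filling a region \(D\) so that \(\partial c\) traverses \(\partial D\) counterclockwise. Then $$\mathrm{d}\omega=\left(\frac{\partial Q}{\partial x}-\frac{\partial P}{\partial y}\right)\mathrm{d}x\wedge\mathrm{d}y,$$ and the generalized Stokes theorem reads $$\oint\limits_{\partial D}P\,\mathrm{d}x+Q\,\mathrm{d}y=\iint\limits_{D}\left(\frac{\partial Q}{\partial x}-\frac{\partial P}{\partial y}\right)\mathrm{d}x\,\mathrm{d}y,$$ which is Green’s theorem.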

Thus, the drudgery of all the new machinery we’ve developed is easily forgiven for this theorem, which is truly beautiful and natural.

Differentiation (again)

Occasionally, we encounter mathematicians who are unwilling to part with the idea of the derivative as a quotient of differentials. When I come across such people, I typically appease them using the following:

Define the division of two linearly dependent vectors \(v\neq 0\) and \(w=\lambda v\) by \(w\div v=\lambda\). The exterior derivative of \(f:\mathbb{R}\to\mathbb{R}\), which is \(\mathrm{d}f=f'\,\mathrm{d}x\), is a scalar multiple of \(\mathrm{d}x\) at each point. That is, \(\mathrm{d}f_t=f'(t)\,\mathrm{d}x_t\). Thus, \(\mathrm{d}f_t\div\mathrm{d}x_t=f'(t)\).
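
For instance, with \(f(x)=x^2\) (chosen only for illustration), \(\mathrm{d}f=2x\,\mathrm{d}x\), so at each point \(t\) the covector \(\mathrm{d}f_t\) equals \(2t\) times the covector \(\mathrm{d}x_t\), and \(\mathrm{d}f_t\div\mathrm{d}x_t=2t=f'(t)\). The “quotient of differentials” is recovered, but as a ratio of covectors rather than of infinitesimals.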

I privately call this the Gooby derivative lemma, in reference to a series of webcomics depicting the Disney characters Donald Duck (Dolan) and Goofy (Gooby), where Donald crudely exploits Goofy’s lack of intelligence under the guise of friendship.


Conclusion

We’ve discussed the benefits of thinking of \(\mathrm{d}x\) as a differential form and demonstrated some of the beauty of differential geometry. Hopefully, this has convinced you of the depth and power of calculus, and taught you that infinitesimals are not the only explanation for it.

References:

[1] Michael Spivak, A Comprehensive Introduction to Differential Geometry, 3rd ed., Vol. 1, Publish or Perish, 1999.

11 Comments


  • Muphrid says:

    With respect, I must still take the position that the differential seen in a common 1-dimensional integral is very different from what is meant by integrating a form.

    This perspective of mine comes from geometric calculus, the application of clifford algebra to calculus. Geometric calculus is capable of replicating the theory of differential forms, but in doing so, it exposes that differential forms often chooses a preferred orientation for the manifold of integration. This choice is often “preferred” merely for the order of the coordinate labels and is in fact quite arbitrary.

    In geometric calculus, the orientation is kept explicit by writing the tangent n-vector into the definition of the integral. The integration of an n-form then means that that n-form acts upon the n-vector, producing a function (scalar field) that can be integrated using a common, multidimensional Riemannian integral. Differential forms often chooses the orientation of the n-vector arbitrarily when this should be dictated by the problem at hand or other considerations. This arbitrary choice makes the notation for integrating differential forms unambiguous, at the cost of generality.

    Differential forms makes it seem as if we should identify “dx” in a Riemannian integral implicitly with the basis 1-form associated with the coordinate “x” that parameterizes a manifold, or at least suggests that these concepts are strongly related.

    Geometric calculus makes it clear that this is misleading, as once the n-form being integrated acts upon the tangent n-vector, there is still a conventional Riemannian integral to be done.

    • Hurkyl says:

      The common one-dimensional integral does have a preferred orientation: from the lower limit to the upper limit.

      In my experience, both applications and methods of computation tend to rely crucially on the fact you’re integrating over oriented regions. Although in top degree in Euclidean space, there is some pedagogical value in avoiding differential geometry and faking an orientation with an unoriented integral by preordaining an orientation, and making sure you always respect the orientation (e.g. by putting absolute values around the Jacobian when making a change of variable, which has the effect of flipping the sign of the integrand whenever the transformation reverses the orientation).

      And it is clear that differential forms really are the right notion of integrand when you’re integrating over oriented regions; no matter which parametrization of the surface you take, you always plug the same differential form into the formulas. e.g. in the usual notation, $\int_S \omega = \int_{S’} \omega$.

      Although, I’m sure there are contexts where unoriented versions of integration are useful and even important, in which case different concerns become important and the above may not apply.

    • Muphrid says:

      Hurkyl: How do you handle integrals whose “lower” limits are greater than the “upper” limits? You interchange the limits in the integral symbol and tack on a minus sign. This is the same as saying that the orientation of the interval is merely “backward” compared to what’s conventional.

      I’m not arguing against the use of oriented regions; I’m saying differential forms relies on choosing orientations for regions based on arbitrary criteria (most often, merely by how the basis is ordered) and treats such choices as implicit. This in itself is not a bad thing, but it contributes to the ongoing notion that basis covectors are “the same as” or “substantially related to” the differentials that appear inside integrals.

      I argue this is misleading because in geometric calculus, the differentials that appear in an integral do not–in any way, shape, or form–come from the differential form being integrated at all. Geometric calculus replicates the whole of the substance of differential forms–they do not disagree in any way as far as the result of a computation–so the meaning of notational concepts like the differential inside an integral should be compatible in both.

  • mvw says:

    The classical notation by Leibniz and his contemporaries, which is still alive in mathematics for engineers and physics, is good enough to allow those customers to solve most of their problems.

    Learning the apparatus of modern differential geometry would require extra effort and probably add not much to the problem solving skills.

    So it will probably stay.

    • Hurkyl says:

      In my opinion, the apparatus of modern differential geometry is not needed to actually do calculations with differential forms — introductory calculus classes already sort of teach it, even while saying that’s not what’s really going on.

      I think there is actually a more fundamental problem. There are two versions of calculus: calculus of functions, and calculus of scalars. While both versions have the same content, they have very different notational flavor and teach you to think in somewhat different ways.

      Applications and calculations are, at least in my experience, predominantly in the form of the calculus of scalars. e.g. you have dependent variables x and y satisfying x^2 + y^2 = 1, and you’re relating variations in y to variations in x.

      However, in introductory classes, the theory is almost exclusively presented in terms of the calculus of functions, which teaches you to think of x and y as being very different kinds of things; e.g. you pick x to be the variable, and then treat y as an abuse of notation being used as shorthand for sqrt(1-x^2).

      In my opinion, treating dx and dy as objects in their own right only makes sense when you’re doing calculus of scalars, which means such ideas are inaccessible to students unless they either have a good intuition for these things, or until they finally advance to a subject that can’t get away with the calculus of functions approach and has to teach the calculus of scalars version. Which, I believe, is typically their first differential geometry course or their first algebraic geometry course.

  • I am delighted to see discussion arise from this. I’ll see if I can answer these points one at a time.

    Muphrid: In full sincerity, I never really looked at geometric calculus before; for some reason I’ve always assumed that it was a synonym for multiplicative calculus. At a glance, it looks very interesting, and I’ll be sure to look further into it when I find time.

    That being said, I still believe that differential forms are the natural way to go, since they make many of the ideas of calculus arise of their own accord. This opinion might simply be because I am an abecedarian when it comes to math in general, though.

    mvw: Perhaps you missed the point. This post was designed to encourage exploration of differential geometry, and explain that the ideas of infinitesimals might be better explained by differential forms.

    Engineers and physicists are also not, strictly speaking, mathematicians, so they can enjoy the freedoms of empirical science. If you are doing physics, and you find “numbers smaller than any real number” intuitive, then you can make that choice. However, I know for a fact that forms are well-accepted in upper tier physics.

    This being said, of course they are going to stay. They haven’t been allowed to die for over 300 years, so why would that stop now?

    • Muphrid says:

      Yeah, I mean, if you know how to do things with forms well enough, the only thing geometric calculus offers is a different notational style, and perhaps a different mindset of doing calculations. One example would be the “difference” between taking a Hodge dual in forms versus clifford multiplying with an n-vector. You’re gonna get the same answer either way.

      Clifford algebra (and its associated calculus) appeals to me as a physicist because I almost exclusively work in a setting with some metric or pseudo-metric. The calculus treats the exterior derivative and interior one on the same footing, and in a setting without a metric, it’s easy enough to forbid any metrical operations (common enough in projective geometry, for instance). To me, this is somewhat cleaner than dealing exclusively with forms and then resorting to duality when needed, but I understand it’s a matter of experience and taste.

      Still, with respect to this article, I make a point about the meaning of dx here as a result of more than one argument on the matter even on MSE. To me, the geometric calculus picture of integrating an n-form is very appealing: the form eats the tangent n-vector, producing a scalar function, which is then integrated according to established multivariable techniques. But, I do recognize that this merely changes the burden of how one defines the integration of a form: in forms notation, we just skip one step compared to geometric calculus, but GC still has to define that integration on an oriented manifold involves multiplying by that manifold’s tangent n-vector.

  • One hardly needs to be an engineer or physicist to find the hyperreals intuitive. Indeed, I’m not at all sure that the extension from the reals to the hyperreals is any less intuitive than the extension from the rationals to the reals. But intuition is a very individual thing, and I freely acknowledge that my view is likely colored by the fact that I like set theory and find the kind of mathematics that obviously appeals to you both unattractive and uninteresting.

  • tomasz says:

    As a follow up question, one could ask what is $d\mu(x)$ (or $\mu(dx)$ or $d\mu$, whichever one you prefer)? When talking about abstract measure spaces, away from smooth manifolds, we still use more or less the same symbols, and it’s far from nonstandard usage, and it doesn’t make any sense to talk about this on its own: it’s just a part of the notation for the integral that highlights the measure and/or the variable with respect to which we are integrating.

    I certainly agree that differential geometry provides the language to express integrals of absolutely continuous measures (with respect to the volume form) over orientable varieties, and it is surely the right way to formalise the otherwise seemingly nonsensical $u=f(t)$ leading to $du= f'(t) dt$ squiggles. On the other hand, introducing differential forms to freshmen would likely only serve to confuse them. So yeah.

  • User Thomas Klimpel has pointed out a mistake in the section Intuition-based Introduction. I wrote that $\mathrm{d}x_p$ is $0$ outside of the subspace generated by the partial derivative with respect to $x$ at $p$. However, I meant that $\mathrm{d}x_p$ is $0$ whenever the component in that subspace is $0$.

    In other words, if $\{x^1,\,\dots\,,x^n\}$ are the standard orthogonal coordinates on $\mathbb{R}^n$, then $\mathrm{d}x^1_p\left(\left.\frac{\partial}{\partial x^1}\right|_p\right)=1$ and $\mathrm{d}x^1_p\left(\left.\frac{\partial}{\partial x^i}\right|_p\right)=0$ for $i\neq 1$.

    Again, I would like to thank Thomas Klimpel for pointing out both the error in the post and the error in the hasty correction I was going to replace it with. The people who can edit the blog posts will probably fix the problem at their earliest convenience.

    • Let me try to answer my implicit question: “The reason I wrote it here is that I’m not yet familiar with how to write latex in comments on the blog.” Because using “$” doesn’t seem to work, I will try “\ (“, “\ [“, or “$ $” instead: “\ (“: > I wrote that \( \mathrm{d}x_p \) is \( 0\) outside of the subspace generated by the partial derivative with respect to \( x \) at \( p \).

      “\ [“: > In other words, if \( \{x^1,\,\dots\,,x^n\} \) are the standard orthogonal coordinates on \( \mathbb{R}^n \), then \[ \mathrm{d}x^1_p\left(\left.\frac{\partial}{\partial x^1}\right|_p\right)=1 \]

      “$ $”: > and $$ \mathrm{d}x^1_p\left(\left.\frac{\partial}{\partial x^i}\right|_p\right)=0 $$ for $i\neq 1$.

      I’m curious to whether any of these forms will work…