Non-nonstandard Calculus, I

By mnoonan

I alluded in one of my very first posts here to a calculus class that I was teaching using the ring \mathbb{R}[dx]/(dx^2 = 0). The class was a six-week, 5-day-per-week intensive course covering the usual material of a college Calculus I course. I’ve been promising Greg and Jim that I’d write up some of my experiences with the course. I think I have had enough time to process the experience — let’s talk non-nonstandard calculus.

Maybe a word about the name before we begin: I think that a lot of my philosophy on mathematics came from taking several analysis classes at Smith College from logician Jim Henle. I learned nonstandard analysis from him one summer, but also learned what he called “non-nonstandard analysis”. The idea was to get infinitesimals into calculus without the big ultrafilter-style logic machinery of standard nonstandard analysis. I’d like to say more about this version of non-nonstandard analysis sometime in the future, but not just this moment. But the main idea is this: we want infinitesimals, but we don’t want to bring in the heavy machinery if we can avoid it. The (serious) tradeoff is that we lose the transfer principle when we use non-nonstandard analysis.

Now, a bit more about the class I taught. Most students had not seen much beyond precalculus. Maybe they had been told how to formally take derivatives of polynomials, but not much more. So I had a fantastically clean slate to work with. This may have seriously affected the outcome of the class.

In the first serious lecture, I introduced dx as a positive number so small that dx^2 = 0. They thought this was a little funny, but came around to it after I drew an analogy to i and i^2 = -1: even if they had philosophical objections to thinking about i or dx as “numbers”, their recent experience with complex numbers in high school put them in a good place for accepting dx as a useful formal symbol, if not an honest-to-goodness number. We could then quickly move on to computing the derivative of some elementary functions like x^2:

\frac{(x + dx)^2 - x^2 }{dx} = \frac{x^2 + 2x dx + dx^2 - x^2}{dx} = \frac{2x dx}{dx} = 2x
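The same computation can be mimicked mechanically. The following is a minimal sketch, not anything used in the class: the names `Dual` and `derivative` are hypothetical, and the class models a number a + b dx with the single rule dx^2 = 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number a + b*dx in R[dx]/(dx^2 = 0). `Dual` is a hypothetical name."""
    real: float       # the ordinary real part a
    inf: float = 0.0  # the coefficient b of dx

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.inf + other.inf)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b dx)(c + d dx) = ac + (ad + bc) dx, because dx^2 = 0
        return Dual(self.real * other.real,
                    self.real * other.inf + self.inf * other.real)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate f at x + dx and read off the coefficient of dx."""
    return f(Dual(x, 1.0)).inf


print(derivative(lambda t: t * t, 3.0))  # 6.0, matching 2x at x = 3
```

Note that the sketch never divides by dx; it only reads off the dx coefficient, which is the same information.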

I also pointed out that computers are using a number system which is closer to this than it is to \mathbb{R}: since floating point processors have finite precision there is a smallest positive denormal. It must square to zero. So if you are judging mathematical truth by “is relevant to things I know in the real world”, it seems that nilsquare infinitesimals aren’t such a strange idea.
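A quick way to see this on a machine, assuming IEEE 754 double precision (which is what Python floats are on essentially every current platform):

```python
import math

# Smallest positive subnormal (denormal) double: 2**-1074, about 4.9e-324.
tiny = math.ldexp(1.0, -1074)

print(tiny > 0.0)           # True: it is a genuinely positive number
print(tiny * tiny == 0.0)   # True: its square underflows to exactly zero
```

The analogy is loose, of course: many much larger doubles also square to zero (anything below roughly 1e-162), and floating-point arithmetic does not satisfy the ring axioms.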

On the second day, I gave them a few useful tidbits: as an axiom, I said e^{dx} = 1 + dx. Then by looking at a right triangle of height dx, we decided that \cos(dx) = 1 and \sin(dx) = dx. From here, it is easy to compute derivatives of relatively complex functions by hand. By the end of the third day, there were homework problems (generally solved correctly) like “use the definition of the derivative to find the derivative of x e^x”:

(x + dx) e^{x + dx} = (x + dx) \cdot e^x \cdot (1 + dx) = e^x (x + dx + x dx + dx^2) = x e^x + e^x(1 + x) dx

so the derivative is e^x + x e^x.

By asking them to compute the derivative of things like \cos(x) or e^{e^x} by hand, they rapidly re-learned (and remembered!) the angle-sum formulas for sine and cosine and the algebraic rules involving exponents. I never allowed formula sheets or calculators or anything like that in the class or on the tests — the kids were genuinely remembering the formulas and the derivations. Even the weaker students at the end of the course could all compute the derivative of \cos(x) as a routine computation:

\cos(x + dx) = \cos(x) \cos(dx) - \sin(x) \sin(dx) = \cos(x) - \sin(x) dx

but even better, they remembered e^a e^b = e^{a + b} or \cos^2 x + \sin^2 x = 1 because these manipulations had become common to them. They used them every day in nearly every problem, and they remembered. No formula sheet hell involved.
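As a worked instance of the e^{e^x} exercise, here is one way the computation might go. It is a sketch that assumes the axiom e^{dx} = 1 + dx extends to any infinitesimal multiple, e^{c dx} = 1 + c dx, which is not stated explicitly above:

```latex
e^{e^{x + dx}} = e^{e^x (1 + dx)} = e^{e^x} \cdot e^{e^x dx} = e^{e^x} (1 + e^x dx) = e^{e^x} + e^x e^{e^x} dx
```

so the derivative would be e^x e^{e^x}, using nothing beyond the rule e^a e^b = e^{a + b}.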

When a number like a + b dx appeared in the denominator, the students quickly figured out that they could multiply on the top and bottom by a - b dx to “realify” the denominator. I was surprised that they saw this trick so quickly, until I realized that they had been doing the exact same thing with complex numbers for several years already. The derivative of \tan(x), with no more than two steps left out:

 \frac{\sin(x + dx)}{\cos(x + dx)} = \frac{\sin(x) + \cos(x) dx}{\cos(x) - \sin(x) dx} = \frac{(\sin(x) + \cos(x) dx)(\cos(x) + \sin(x) dx)}{\cos^2(x)} = \tan(x) + \frac{1}{\cos^2(x)} dx
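The “realify” trick is easy to check numerically. The sketch below is illustrative only: `Dual`, `conjugate`, and the `sin`/`cos` helpers are hypothetical names, and the helpers encode the rules \sin(dx) = dx and \cos(dx) = 1 from the second day of class.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number a + b*dx with dx^2 = 0 (hypothetical helper class)."""
    real: float
    inf: float = 0.0

    def __mul__(self, other):
        return Dual(self.real * other.real,
                    self.real * other.inf + self.inf * other.real)

    def conjugate(self):
        return Dual(self.real, -self.inf)

    def __truediv__(self, other):
        # Multiply top and bottom by the conjugate of the denominator;
        # (a + b dx)(a - b dx) = a^2, an ordinary real number.
        top = self * other.conjugate()
        bottom = other.real ** 2
        return Dual(top.real / bottom, top.inf / bottom)


def sin(z):
    # sin(x + b dx) = sin(x) + b cos(x) dx
    return Dual(math.sin(z.real), z.inf * math.cos(z.real))


def cos(z):
    # cos(x + b dx) = cos(x) - b sin(x) dx
    return Dual(math.cos(z.real), -z.inf * math.sin(z.real))


x = Dual(0.5, 1.0)        # the number 0.5 + dx
tan_x = sin(x) / cos(x)
print(tan_x.inf, 1 / math.cos(0.5) ** 2)  # the two values agree up to rounding
```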

On the final, I asked them to compute the derivatives of x^2 e^{2x}, x/\cos(x), and e^x \sin(x) from the definition. Almost uniformly, the students could create correct computations of the derivative. The next question on the final was “prove the quotient rule”. Nearly everybody in the class, even the C-students, did this derivation properly. They were not told beforehand that anything like this would be on the test.
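For reference, a derivation in this style might run as follows. This is a sketch, not a reproduction of any student's answer: write f(x + dx) = f + f' dx and g(x + dx) = g + g' dx, with f, g, f', g' all evaluated at x and g(x) nonzero, then realify the denominator:

```latex
\frac{f + f' dx}{g + g' dx} = \frac{(f + f' dx)(g - g' dx)}{g^2} = \frac{f g + (f' g - f g') dx}{g^2} = \frac{f}{g} + \frac{f' g - f g'}{g^2} dx
```

The coefficient of dx is the usual quotient rule.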

Another thing which they were fairly good at was using differentials, since the differential was not some mysterious formal symbol to them but an actual infinitesimal value. From this perspective, it is easy to see how to use differentials to get good approximations. This also made it as easy to work with implicit differentiation as with explicit differentiation, which in turn makes computations of related rates much cleaner than usual. From my perspective, it also makes the usual geometric ways of demonstrating the product rule or the fact that (\pi r^2)' = 2\pi r more rigorous: the area added really is just some lengths times the width of my chalk. By treating infinitesimals on the same footing as finite numbers, the approximation schemes of calculus become more intuitive. I think every mathematician has discovered this on their own, in their own private language. Why not make the language commensurable with the computations that we do?
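The circle example can be written out in one line. Here dr is an assumed name for the nilsquare increment in the radius:

```latex
\pi (r + dr)^2 - \pi r^2 = 2 \pi r \, dr + \pi \, dr^2 = 2 \pi r \, dr
```

The added area is exactly a strip of length 2\pi r (the circumference) and width dr, with no error term left over to argue away.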

I’ll write plenty more about the other topics of the course in the future, but this should give you some idea of what the class was like and why I felt that nilsquare infinitesimals were a productive way to teach calculus. We really didn’t use limits at all: they were all replaced with infinitesimals and approximation schemes. It is true that limits, infinitesimals and approximation schemes are all equivalent ideas, but I believe the latter two more closely model our internal, geometric understanding of calculus. So: what do you readers think?

28 Responses to “Non-nonstandard Calculus, I”

  1. John Armstrong Says:

    Actually, it reminds me of something Louis Kauffman has on his door. It’s all of integral and differential calculus on two pages, and the only difference is he uses a lowercase delta instead of dx.

    My qualm here is that you manage to completely bypass limits. What effects do you think this will have when the students hit series, if they ever do?

  2. James Says:

    Very nice. I would be really interested in what you think the disadvantages of this approach are. There must be many, as some sort of balance to the many advantages. Probably the best way to think of them is to teach a course and see what happens.

  3. Jurgen Says:

    I am very intrigued by the nonstandard approach to calculus. Could you point us to some good introductory books/lecture notes?

  4. Shaneal Manek Says:

    Is this just a rephrasing of automatic differentiation with dual numbers? It seems that your students may run into trouble later because they don’t understand enough abstract algebra to know what’s going on, and may just resort to manipulating symbols without any real understanding when things get tough.

    I’ve never been much of a calculus guy (barely learned enough to get through diffeq and real analysis), so please excuse any ignorance on my part.

    Thanks

  5. John Armstrong Says:

    Students … may just resort to manipulating symbols without any real understanding when things get tough.

    And that’s not what they do now?

  6. Charles Says:

    So does non-nonstandard analysis have anything to say about the integral? It seems nice for differentiation, but I don’t immediately see how to use the dual number approach to integrate.

  7. davidspeyer Says:

    This looks very nice. I’ve been curious for a while whether I could teach calculus using big O notation as the fundamental object instead of limits, and this looks like a similar but easier approach. Question — what do you plan to do when you get to Taylor series? When I do calculus computations, that’s when I start really hauling out the big O’s. (For example, try computing the first three nonzero terms of the Taylor series for sqrt(cos(x)) by repeated differentiation, and then do it again by expanding (1-x^2/2+x^4/24+O(x^6))^{1/2} by the binomial theorem.)
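    For readers who want to check the second method, the expansion can be sketched as follows, with u = -x^2/2 + x^4/24 and the binomial series (1 + u)^{1/2} = 1 + u/2 - u^2/8 + O(u^3):

```latex
\sqrt{\cos x} = 1 + \frac{1}{2}\left(-\frac{x^2}{2} + \frac{x^4}{24}\right) - \frac{1}{8}\left(\frac{x^4}{4}\right) + O(x^6) = 1 - \frac{x^2}{4} - \frac{x^4}{96} + O(x^6)
```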

  8. Terence Tao Says:

    This is indeed a nice way to do differentiation, but it may run into some trouble once one leaves the category of smooth or analytic functions. For instance: is x^{3/2} differentiable at x=0? What about |x|? etc. More generally, it is hard to detect non-differentiability in this setting. (This is the usual phenomenon in algebra that it is easy to prove two things are equal, but difficult to prove that two things are unequal. Analysis, of course, has the opposite problem :-) .)

    Also, one may encounter some difficulty with second derivatives. One wants to assert the formula f''(x) = \frac{f(x+2dx) - 2f(x+dx) + f(x)}{(dx)^2}, but you can’t do that within the ring {\Bbb R}[dx]/(dx^2 = 0) – division by zero error! For similar reasons, it is a little tricky to justify any Taylor expansion beyond first order.

    Finally, it’s not particularly easy to show f'=0 \implies f = \hbox{const} in this framework, which will lead to some problems when one gets to the indefinite integral. Nevertheless, one can at least get the fundamental theorems of calculus by postulating the existence of a definite integral operation (a, b, f) \mapsto \int_a^b f which obeys some reasonable axioms (linearity, concatenation, translation-invariance, and most importantly that \int_x^{x+dx} f = f(x)\ dx).
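    As a sketch of how those axioms could yield the fundamental theorem (the name F for the accumulation function is an assumption introduced here), set F(x) = \int_a^x f and apply concatenation followed by the infinitesimal axiom:

```latex
F(x + dx) = \int_a^{x + dx} f = \int_a^x f + \int_x^{x + dx} f = F(x) + f(x) \, dx
```

    Reading off the coefficient of dx gives F'(x) = f(x).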

    Things also get rather interesting in several variable calculus: what ring would you use, for instance, to prove Clairaut’s theorem \frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x}? (and how do you deal with the fact that this theorem can in fact fail for certain twice differentiable functions?). And Stokes theorem is going to be particularly challenging without using much more of the machinery of infinitesimals than just dx^2 = 0.

    In short, it is a nice way to get students quickly started on single-variable differential calculus, but one should also be prepared to move beyond this approach when one wants to tackle the rest of the calculus.

  9. Terence Tao Says:

    p.s. This approach is excellent for introducing the concept of a tangent bundle, and interpreting the derivative of a map between manifolds as a linear map between two tangent bundles. Indeed, one can argue that this approach is basically the algebraic way of viewing R as a manifold. This also clarifies a bit the difficulties mentioned above; second derivatives on manifolds are indeed a real pain, one needs the machinery of connections. Also, now that one has some non-trivial global cohomology, it’s clearer as to why things like Stokes’ theorem are not going to be trivial.

  10. mnoonan Says:

    Re: John’s question about bypassing limits. As we got further into the course, and the students got more used to approximation schemes (differentials, Newton’s method, Riemann sums for area and Riemann sums for solving a differential equation), I think that their ability to think about limits developed on its own. We talked about limits long enough to discuss L’Hospital’s rule near the end of the course. During this class, we developed a translation dictionary on the board between limits and infinitesimals. Those who had seen the derivative defined before had an “oh, I see where that h thing came from” moment. It was heart-warming. I have trouble believing that there is enough content in the idea of “limit” that a decent student couldn’t pick it up rapidly when they begin to deal with series in Calc 2. This use of “limit” is just “this sequence of approximations is as good as you would like”, after all.

    I am planning on emailing the students after another several weeks have passed to see how they are doing in the next calculus course. I think this will be the real test of the idea, and I will be sure to pass the results along.

  11. mnoonan Says:

    Shaneal, you are right that this is the same idea that is exploited with automatic differentiation. But I don’t think the students saw it as purely formal — I talked about infinitesimals on equal footing with real numbers, we drew chalk-sized infinitesimal things in our diagrams, and most importantly the students began to correctly use the language: “A and B are infinitesimally close” and so forth.

    I’m not sure, but I suspect that by using honest equations rather than limits in (for example) the definition of the derivative, the students are more inclined to understand what is going on. The definition of the derivative via dx is, after all, just an equation; the one with limits carries an implicit “for all / there exists” in front of it. The more \forall \exists alternations, the harder it is to understand the underlying idea.

  12. Terence Tao Says:

    p.p.s. There is another slight issue, which is to decide as to whether dx is “positive”, “negative”, or “neither” (the latter is what happens of course with i = sqrt(-1)). This becomes relevant when one wants to show that a function has derivative zero at local extrema. One can work around this issue by having both positive and negative infinitesimals (and thus deal with right and left derivatives), but this approach does not extend well to several variables.

    It’s also worth noting that Ito calculus is traditionally presented using this algebraic infinitesimal approach, by adjoining an additional relation d\omega^2 = dx.

    One final advantage (or disadvantage, depending on your point of view) is that this approach does not really use the structure of the reals and so also applies over arbitrary fields. As such it is highly compatible with algebraic geometry, schemes, etc. (This also explains why it is not so terribly compatible with the definite integral, or the detection of local extrema.)

  13. fjfjj Says:

    This approach seems to me problematic for three distinct reasons.

    First, there is little benefit. Actually computing derivatives can be taught in half-a-day with a few formulas. You need only the product, quotient, chain, and power rules, as well as a couple of memorized formulas (derivatives of exp, ln, cos and sin). Since this is not a problem, there is no reason to modify the way to teach the simplest part of calculus. The hard parts of calculus are limits, series, integrals, extrema, n-dimensionality, discontinuities, differential equations, and word problems. Derivative computing is just not hard for anyone who will be able to master the rest of calculus (those for whom it is hard won’t learn calculus anyway).

    Second, the approach does not prepare students to read and understand works written by the remainder of the mathematics, physics and engineering community.

    Third and most important, the approach misconstrues the purpose of teaching calculus. This purpose is not solely to teach students to manipulate derivatives. Instead, it is to teach them to understand concepts of rigor, proof and meaning. Calculus is the course where teachers should not lie to students about the true meaning of what they do. This method, even if it works to the extent of being useful mnemonically, shortchanges students because they are deceived as to why it works. I don’t see that students have any real understanding of derivatives using this method, even if they can compute them.

    That said, the differential notation is nonetheless powerful and useful in its own right, but should only be introduced after students have understood the limit formalism. Thus, I would introduce limits, and then spend some time on the differential notation.

  14. mnoonan Says:

    Terry: one sharp kid pointed out exactly your point about second derivatives in one of the first classes: “but don’t we write d^2 f / dx^2 for second derivatives?” The problem is really that when you need to do things like take second derivatives, you need to keep track of two potentially incommensurate infinitesimals at once. Some of our early posts on the blog mentioned how even if dx and dy are nilsquare, unless you know that dy/dx is finite then the sum (dx + dy) is only nilcube. (dx + dy) becomes nilsquare again if you propose that dx dy = -dy dx, and suddenly it seems that we are about to start a conversation on \Omega^k_M with freshmen. This is certainly a big drawback to the approach: infinitesimals on different orders of magnitude are too messy, so you have to stick close to situations involving only one infinitesimal scale. For a calculus 1 course, this is not so strict a requirement. For a calculus 3 or analysis course, it is probably not worth the effort.
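    Spelled out, assuming dx^2 = dy^2 = 0 and that dx and dy commute:

```latex
(dx + dy)^2 = dx^2 + 2 \, dx \, dy + dy^2 = 2 \, dx \, dy, \qquad (dx + dy)^3 = 2 \, dx \, dy \, (dx + dy) = 0
```

    whereas with the anticommuting rule dx dy = -dy dx the cross terms cancel:

```latex
(dx + dy)^2 = dx^2 + dx \, dy + dy \, dx + dy^2 = 0
```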

    You can, of course, use nilcube, nil(4th), … infinitesimals to compute second and third derivatives directly, using exactly the formula that you have written. This is also one way to tackle Taylor series in the same spirit.
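    A minimal sketch of the nilcube version, with `Nilcube` and `second_derivative` as hypothetical names. Rather than literally dividing by dx^2, it reads off the coefficient of dx^2 in the second difference, which carries the same information:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nilcube:
    """a + b*dx + c*dx^2 with dx^3 = 0 (hypothetical helper class)."""
    a: float
    b: float = 0.0
    c: float = 0.0

    def __add__(self, o):
        return Nilcube(self.a + o.a, self.b + o.b, self.c + o.c)

    def __sub__(self, o):
        return Nilcube(self.a - o.a, self.b - o.b, self.c - o.c)

    def __mul__(self, o):
        # Multiply out and drop every term containing dx^3 or higher.
        return Nilcube(self.a * o.a,
                       self.a * o.b + self.b * o.a,
                       self.a * o.c + self.b * o.b + self.c * o.a)


def second_derivative(f, x):
    two = Nilcube(2.0)
    # In this ring, f(x + 2dx) - 2 f(x + dx) + f(x) = f''(x) dx^2.
    diff = f(Nilcube(x, 2.0)) - two * f(Nilcube(x, 1.0)) + f(Nilcube(x))
    return diff.c  # coefficient of dx^2


def cube(t):
    return t * t * t


print(second_derivative(cube, 2.0))  # 12.0, matching 6x at x = 2
```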

    Detecting non-differentiability is not so bad after all. In fact, it is remarkably easy to do on the class of “high school functions”. On one test, they had to show that |x| was continuous but not differentiable at x = 0. To see continuity, you only need to note that the three (and only three!) numbers |-dx|, |0| and |dx| are infinitesimally close. To see that the derivative doesn’t exist, you only need to compute the left and the right derivatives (using dx and -dx) to find out if they agree. I want to write much more about this aspect of the class soon, because it was one of the most philosophically confusing aspects of the course to me.
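    Written out, and assuming dx is positive as it was introduced in class (so that |dx| = |-dx| = dx), the two one-sided computations at x = 0 are:

```latex
\frac{|0 + dx| - |0|}{dx} = \frac{dx}{dx} = 1, \qquad \frac{|0 - dx| - |0|}{-dx} = \frac{dx}{-dx} = -1
```

    The right and left derivatives disagree, so |x| has no derivative at 0.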

    Your description of how we have to approach integration is pretty dead-on. More on that in a future installment ;)

  15. mnoonan Says:

    fjfjj: Your criticisms are reasonable, but I think misguided. I overemphasized computing derivatives from first principles in this post, but the computation of derivatives was of course only a very small portion of the class. I will be writing about the other aspects of the class in the near future. I believe that for most students, the hard part of calculus is modeling: they do not know how to translate ideas about the world to equations which they can manipulate, and they don’t know how to take an equation and create a mental model of what it represents. This is manifested in how strongly they avoid word problems of the optimization or related rates types, even though these problems are almost trivial once an actual equation has been written down.

    The idea of this course was not really “look at this funny trick for computing derivatives” but rather “can I teach calculus so that the symbolic methods we use more closely model the intuitive ones”? It is not clear that these students were better than my previous limit-ed ones at word problems, but they certainly weren’t any worse.

    As for your second reason, I simply can’t agree. Even while I was still teaching the class, one student commented that her physics instructor was impressed that she was learning calculus “the right way” [with infinitesimals]. In his work, the infinitesimal displacements were real quantities to be manipulated.

    As for your third point, I don’t understand how it deceives the students. It works for the exact same reason limits or O-notation or nonstandard analysis or any other formulation of calculus works: it is a good model of the language of calculus. In at least this one case, the students came away with a better understanding of derivatives than usual. They can understand the derivative directly, getting their hands right on it through computation or seeing the tangent line between two infinitesimally close points. I’d like to better understand why this approach seems deceitful to you, since I have received similar reactions from several other people.

  16. This week in the arXivs… at It’s Equal but It’s Different Says:

    [...] Non-nonstandard Calculus, I [...]

  17. jpivarski Says:

    As a physics major with an undergrad math focus, I had a chip on my shoulder about infinitesimals versus limits. In physics classes, we talked about infinitesimal quantities all the time, but I always mentally translated them into limits, believing that this was the only correct way to think about it. I ran into calculational difficulties with this mental model in grad school, finding that my peers could solve problems I couldn’t even get my head around, and, after some discussion, we found that it was because they truly believed in infinitesimals. When I decided to admit them as a useful fiction, statistical mechanics got a lot easier. It would have been great to have seen both tools, and to have known the limitations of each (though that can be hard to teach to students who are looking for The Right Way to Think).

    Thanks for the great post, Matt! By the way, what kinds of students were you teaching: engineering, physics, or math majors? Were they honors students or general?

  18. Anonymous Says:

    Cos(dx/2) = 1;
    Sin(dx/2) = dx/2;
    Sin^2(dx) = dx^2=0;
    from here:
    0 = Sin^2(dx) = 2*Sin(dx/2)*Cos(dx/2) = 2 * dx/2 * 1 = dx > 0
    0>0, congratulations.

  19. mnoonan Says:

    Is “Anonymous” somehow managing to confuse \sin^2(x) with \sin(2x)?

  20. Jesse Says:

    dx is a zero-divisor in the ring R[dx]/(dx^2), so if you invert it it becomes zero in the localization. How then can you algebraically justify dividing by dx?

  21. Todd Trimble Says:

    “How then can you algebraically justify dividing by dx?”

    You don’t. You instead write f(x + dx) = f(x) + f’(x)dx for a unique scalar f’(x). This points to the difference between infinitesimal analysis using invertible infinitesimals (as in Robinson’s Nonstandard Analysis) and analysis which uses nilpotent infinitesimals (as in “Synthetic Differential Geometry”; see e.g. Mike Shulman’s paper here).

  22. Math Bloggers - Today’s Top Blog Posts on Mathematics - Powered by SocialRank Says:

    [...] Non-nonstandard Calculus, I « The Everything Seminar [...]

  23. Anonymous Says:

    No, I didn’t get it mixed up – you got your . . . face . . . mixed up with . . . something stupid . . . I . . . . . . . .am a secret genius that the world has yet to discover! And I prove it by leaving obviously incorrect proofs misusing high school math on message boards without having the courage to leave my name hoping to score points from a safe distance that only I will keep track of on my mental scorecard (of course erasing this instance and most likely many others like it out of memory so as to maintain my delusion of superior intelligence . . . .(shhhhhhh!))

  24. Mark Hanlon Says:

    Congratulations. You seem to have discovered the main features of smooth infinitesimal analysis (SIA). This is a version of analysis based on the repudiation of the general applicability of the law of excluded middle (LEM) and the principle of microstraightness of smooth functions. Nilsquare infinitesimals emerge naturally from these principles. SIA is entirely rigorous (its foundations are in Category theory) and it has at least four major advantages over Limit theory and non-standard analysis (NSA – which is just Limit theory in disguise). These are:
    1. In SIA the differential calculus is reduced to simple algebra.
    2. SIA does not lead to contradictions such as the Banach-Tarski paradox as does Limit theory/NSA.
    3. The method of microadditivity found in physical derivations is a natural application of SIA but not of Limit theory/NSA.
    4. The ‘taking the standard part’ fraud of NSA is unnecessary in SIA because the infinitesimals cancel each other out.
    The best book on SIA is A Primer of Infinitesimal Analysis by J L Bell. Many of the criticisms of your approach given above are addressed in his book. I personally found it useful to compare Bell’s book with The Foundations of Mathematics by Stewart and Tall. In their book Stewart and Tall use the quest to explain calculus to justify the Completeness axiom and classical Real analysis – a justification which falls apart with SIA. They also use the Axiom of Choice in their coverage of Cardinal numbers; but this axiom also implies the LEM thereby disallowing nilsquare (that is, genuine) infinitesimals. Consequently, you can believe in infinite numbers or infinitesimals but not both, or at least not both at the same time. This may explain Cantor’s objection to infinitesimals! It is interesting to note that Fuzzy Logic also depends on the repudiation of the LEM. Calculus and Fuzzy Logic are perhaps the two branches of mathematics which are the most useful in modelling reality. Perhaps Cardinals, ‘Real’ analysis, and the LEM should be banished to the fringes of philosophy.

  25. Michael O'Connor Says:

    I recently wrote up some notes on Smooth Infinitesimal Analysis, the system for which the immediately preceding commenter is an aggressive advocate.

    They’re at:
    http://www.math.cornell.edu/~oconnor/sia.pdf

    The biggest contrast between Smooth Infinitesimal Analysis and Matt’s system (and non-standard analysis as well) is that while in both Matt’s system and non-standard analysis the object representing reals+infinitesimals is explicitly constructed and can be manipulated directly with the usual classical logic that mathematicians are used to, in Smooth Infinitesimal Analysis, the object representing reals+infinitesimals is presented axiomatically, and you must reason about it using intuitionistic logic in order to make the existence of infinitesimals consistent.

    It sounds forbidding, but you can do a lot of pretty neat things with it.

  26. Todd Trimble Says:

    I fear that some of the above comments may be slightly misleading. The objects that SIA deals with do not reside merely at the level of axiomatic description; they may be embodied in models which admit explicit constructions by taking categories of sheaves on appropriate sites. Externally speaking, one uses ordinary logic to describe these sites. It is of course true that the “internal logic” in such sheaf toposes is intuitionistic (meaning that lattices of subobjects of objects are not Boolean algebras; they are Heyting algebras), and it is in that sense that the inclination to manipulate the relevant objects directly as “smooth sets” must take intuitionistic logic into account.

  27. Michael O'Connor Says:

    You’re right, of course. I should have mentioned the models in my summary.

    My reasoning for not doing so is that if one were to teach calculus via SIA, one would do so axiomatically and not via the models, just like we teach math majors to reason axiomatically in ZFC before we teach them (if we ever do) how to construct models of ZFC with set-theoretic forcing.

    However, for people who are not learning calculus for the first time, learning how SIA works by going through the construction of the models may be the most enlightening way to do it. It could certainly help a classical mathematician who may be suspicious of or not fully understand intuitionistic logic to figure out what exactly is going on.

  28. Alessio Says:

    Hi folks, here is the “truth” about how practical calculus gets done: for the years I was at university, the teachers told me “you cannot simplify” derivatives, but in every kind of calculation they used the derivative as a fraction ;)
