Integrating the Derivative

Dual to the problem addressed in the previous post is the question of when the conclusion of the fundamental theorem of calculus holds, namely that F(b)-F(a) = \int^b_a F'(x)dx. It turns out the condition of absolute continuity defined at the end of the last post is sufficient. First, define the variation of a complex-valued function F over a partition a = t_0< \cdots < t_N=b to be \sum^N_{i=1}|F(t_i)-F(t_{i-1})|. It is easy to see, by the triangle inequality, that the variation can only increase as the partition is refined; we say that a function is of bounded variation if the supremum of the variation over all partitions, the total variation T_F(a,b), is finite. This will be the case, roughly speaking, for functions that do not oscillate too widely or too frequently. We can see that real, monotonic, bounded functions, as well as differentiable functions with bounded derivative, are of bounded variation.
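
As a small numerical illustration of the definition, here is a minimal Python sketch that computes the variation over a given partition. The helper names variation and uniform_partition are made up for this example; the point is only to see the variation grow under refinement and settle near the total variation.

```python
import math

def variation(f, partition):
    """Sum of |f(t_i) - f(t_{i-1})| over consecutive partition points."""
    return sum(abs(f(t1) - f(t0)) for t0, t1 in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """n + 1 equally spaced points from a to b (an illustrative helper)."""
    return [a + (b - a) * i / n for i in range(n + 1)]

square = lambda x: x * x

# Refining the partition of [-1, 1] by adding the point 0 can only
# increase the variation: the coarse partition misses the dip at 0.
coarse = variation(square, [-1.0, 1.0])
fine = variation(square, [-1.0, 0.0, 1.0])

# For sin on [0, 2*pi], a fine partition gives a value close to the
# total variation (up 1, down 2, up 1).
sine = variation(math.sin, uniform_partition(0.0, 2 * math.pi, 1000))

print(coarse, fine, sine)
```

The partition of sin is chosen so that the turning points \pi/2 and 3\pi/2 are grid points, which is why the sum is already essentially equal to the total variation.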

The first result of this post will show some of the motivation for studying these functions.

Result 1: Bounded variation implies differentiability almost everywhere.

Proof: To prove this, let’s first narrow our focus solely to increasing functions. We can do this because of the following characterization of real-valued functions of bounded variation: they are precisely the functions that are differences of two increasing bounded functions. (A complex-valued function is handled by treating its real and imaginary parts separately.) One direction is obvious: the difference of two increasing bounded functions is of bounded variation. To prove the other direction, define the positive and negative variations P_F(a,b) and N_F(a,b) over the interval to be \sup\sum_{(+)}(F(t_j)-F(t_{j-1})) and \sup\sum_{(-)}-(F(t_j)-F(t_{j-1})), where the suprema are over all partitions and the sums are taken over the positive and negative differences, respectively. We then have that F(b)-F(a) = P_F(a,b) - N_F(a,b) and T_F(a,b) = P_F(a,b)+N_F(a,b). To finish, simply take the two increasing functions to be x\mapsto P_F(a,x)+F(a) and x\mapsto N_F(a,x), whose difference is F(x).
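
The two identities are easiest to believe at the level of a single partition, before taking suprema. A minimal sketch, with the hypothetical helper pos_neg_variation splitting the increments of F by sign:

```python
def pos_neg_variation(f, partition):
    """Return (P, N): the sums of the positive increments and of the
    absolute values of the negative increments over one partition."""
    pos = neg = 0.0
    for t0, t1 in zip(partition, partition[1:]):
        step = f(t1) - f(t0)
        if step > 0:
            pos += step
        else:
            neg -= step
    return pos, neg

square = lambda x: x * x
partition = [-1.0, 0.0, 1.0]

P, N = pos_neg_variation(square, partition)
net_change = square(partition[-1]) - square(partition[0])

# On a fixed partition: net change = P - N, and the variation = P + N.
print(P, N, net_change)
```

Here P - N recovers F(b)-F(a) and P + N is the variation over the partition; the statements about P_F, N_F and T_F follow by passing to suprema over partitions.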

Next, we’ll prove the lemma that forms the crux for our proof of almost-everywhere differentiability.

Riesz’ Lemma: For a real, continuous function G on [a,b], let E be the set of points x in (a,b) for which there exists some h_x>0 such that G(x+h_x)>G(x). Then E is open, and if it is nonempty it is the union of countably many disjoint open intervals (a_k,b_k) such that G takes the same value at both endpoints of each interval, except possibly for an interval starting at a_k = a, for which we only have G(a_k)\le G(b_k).

Proof: The proof is evocative and beautiful: think of the graph of the function as a series of rolling hills and imagine a sun shining from far off to the right. The shadows on the hills are precisely the intervals composing E, and each contiguous shadow certainly ends at the same height at which it starts, except possibly the first one! The other way to prove this is just to apply the intermediate value theorem repeatedly.
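
The shadow picture can be mimicked on a grid. The following is a rough discrete sketch, not a proof: it assumes G is only known through finitely many samples, and shadow_runs is a name invented here. On a grid the equality of heights at the two ends of a shadow degrades to an inequality, since the true left endpoint usually falls between grid points.

```python
def shadow_runs(values):
    """Return maximal index runs (i, j) such that every sample in the run
    has a strictly higher sample somewhere to its right."""
    n = len(values)
    in_shadow = [False] * n
    highest_to_right = float("-inf")
    for k in range(n - 1, -1, -1):
        in_shadow[k] = highest_to_right > values[k]
        highest_to_right = max(highest_to_right, values[k])

    runs, start = [], None
    for k, flag in enumerate(in_shadow):
        if flag and start is None:
            start = k
        elif not flag and start is not None:
            runs.append((start, k - 1))
            start = None
    # The last sample is never in shadow, so every run has been closed.
    return runs

heights = [3, 1, 2, 3, 0, 1, 0]
runs = shadow_runs(heights)
print(runs)

# Each run sits strictly below the sample just to its right (the hill
# casting the shadow), and the sample just to its left is at least as high.
for i, j in runs:
    assert all(heights[k] < heights[j + 1] for k in range(i, j + 1))
    assert i == 0 or heights[i - 1] >= heights[j + 1]
```

For the sample heights above, the first shadow is flanked by two samples of equal height 3, matching the lemma exactly, while the second is flanked by heights 3 and 1 because the grid is too coarse to locate the point where the shadow actually begins.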

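Returning to the proof, define the difference quotient \Delta_h(F)(x) = \frac{F(x+h)-F(x)}{h} and the four Dini numbers D^+, D^-, D_+, D_- to be the limit superior of \Delta_h(F)(x) as h\to 0 from the right and from the left, and the limit inferior of \Delta_h(F)(x) as h\to 0 from the right and from the left, respectively. We always have D_+\le D^+ and D_-\le D^-, so it suffices to prove that D^+ is finite almost everywhere and that D^+\le D_- almost everywhere. Indeed, applying the second statement to -F(-x) gives D^-\le D_+ almost everywhere, and then D^+\le D_-\le D^-\le D_+\le D^+, so all four numbers agree and are finite.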

It is time to use our lemma: for a given \gamma>0, let E_{\gamma} be the set of x for which D^+(F)(x)>\gamma. We want to show that the measure of E_{\gamma} tends to zero as \gamma becomes large, which would imply that D^+<\infty almost everywhere. Apply Riesz’ lemma to the function G(x) = F(x) - \gamma x and note that E_{\gamma} is contained in the union of open intervals described in the lemma: if a point satisfies D^+(F)(x)>\gamma, then F(x+h)-F(x)>h\gamma for some h>0 and thus G(x+h)>G(x). So the measure of E_{\gamma} is bounded by \sum_k m((a_k,b_k)), where G(a_k)\le G(b_k), so that F(b_k)-F(a_k)\ge\gamma(b_k-a_k). Thus \sum_k m((a_k,b_k))\le\frac{1}{\gamma}\sum_k (F(b_k)-F(a_k)), and because F is increasing and the intervals are disjoint, this is bounded above by \frac{1}{\gamma}(F(b)-F(a)), proving as desired that m(E_{\gamma}) becomes arbitrarily small as \gamma increases.

We use the same approach to prove that D^+\le D_- almost everywhere: define a set that can be used with Riesz’ lemma and prove it’s of measure zero. For fixed rationals R>r, let E be the set where D^+>R and D_-<r. Because the set where D^+>D_- is the countable union of such sets over all rational pairs R>r, we just need to prove m(E)=0. To the contrary, assume that m(E)>0. Then there is some open set O with E\subset O\subset(a,b) such that m(O)<m(E)\cdot R/r. O is the union of disjoint open intervals I_n, so pick one; we’ll use Riesz’ lemma twice. Basically, in each I_n, we’re going to construct a set O_n which is small relative to I_n (in fact, m(O_n)\le\frac{r}{R}m(I_n)) but still contains E\cap I_n. Then m(E) = \sum_n m(E\cap I_n)\le\sum_n m(O_n)\le\frac{r}{R}\sum_n m(I_n) = \frac{r}{R}m(O)<m(E), a contradiction which shows that our assumption m(E)>0 was false.

Because we are approaching from the left in the case of D_- and our lemma considers high values lying on the right, we reflect through the origin by considering G(x) = F(-x)+rx on the reflected interval and obtain a union of open intervals. Reflecting these back through the origin, we get some \cup(a_k,b_k) containing E\cap I_n such that G(-b_k)\le G(-a_k) and thus F(b_k)-F(a_k)\le r(b_k-a_k). For each of these subintervals (a_k,b_k), further apply the Riesz lemma using G(x) = F(x) - Rx to get another layer of subintervals (a_{k,j},b_{k,j}) so that F(b_{k,j})-F(a_{k,j})\ge R(b_{k,j}-a_{k,j}). Taking the union of all these subintervals, we get a set O_n which still contains E\cap I_n. Because F is increasing, its measure satisfies m(O_n) = \sum_{k,j}(b_{k,j}-a_{k,j}) \le \frac{1}{R}\sum_{k,j}(F(b_{k,j})-F(a_{k,j})) \le \frac{1}{R}\sum_k (F(b_k)-F(a_k)) \le \frac{r}{R}\sum_k(b_k-a_k) \le \frac{r}{R}m(I_n), as desired.

Now that we have differentiability almost everywhere, we can approximate F' by the difference quotients G_n(x) = \frac{F(x+1/n)-F(x)}{1/n} and then invoke Fatou’s lemma on these to get that for F increasing and continuous, \int^b_a F'(x)dx \le F(b)-F(a). In particular F' is integrable, and by our characterization of functions of bounded variation, this integrability extends to those functions as well.

But it turns out we can do no better than this inequality without narrowing our focus further. The Cantor-Lebesgue function is the standard counterexample; the basic premise is to construct a sequence of continuous “stairstep functions” where the steps come at the endpoints of the intervals in the stages C_n of the Cantor set construction. In this case the derivative is almost everywhere zero, but F(1)-F(0)= 1. Specifically, define the convergents F_n of the Cantor-Lebesgue function such that F_n(0) = 0, F_n(1) = 1, F_n(x) = \frac{k}{2^n} on the k-th interval (counting from the left) of the complement of C_n, and F_n is linear on each interval of C_n. An image I like to use to visualize this is that we’re climbing from height 0 to height 1 in such a way that even when we make it to 1 at the end, our vertical progress has been so slow that it might seem like we’d traveled nowhere.
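
To see the climb concretely, here is a small Python sketch. It does not build the convergents F_n; it evaluates the limiting function through the usual ternary-digit description (read the base-3 digits of x, stop at the first 1, replace each 2 by 1, and read the result in base 2). The names cantor and cantor_stage are made up for this example.

```python
from fractions import Fraction

def cantor(x, depth=40):
    """Approximate the Cantor-Lebesgue function on [0, 1] from the first
    `depth` ternary digits of x."""
    if x >= 1:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit of x
        x -= digit
        if digit == 1:          # x lies in a removed middle third
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value

def cantor_stage(n):
    """Closed intervals making up the n-th stage C_n, as exact fractions."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_stage = []
        for a, b in intervals:
            third = (b - a) / 3
            next_stage += [(a, a + third), (b - third, b)]
        intervals = next_stage
    return intervals

stage = cantor_stage(5)
total_length = sum(b - a for a, b in stage)
total_rise = sum(cantor(b) - cantor(a) for a, b in stage)
print(total_length, total_rise)
```

At stage 5 the intervals of C_5 have total length (2/3)^5 = 32/243, yet the function does all of its rising on them. Letting n grow, the whole climb from 0 to 1 happens on a set of arbitrarily small length, which is exactly what the fundamental theorem cannot tolerate.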

What it takes to get the fundamental theorem of calculus to hold is the stronger condition of absolute continuity. Recall that the generalized definition is that for any \epsilon>0 there exists a \delta>0 such that m(E)<\delta implies \int_E |F|< \epsilon. In the one-dimensional case, a function F is absolutely continuous if for every \epsilon>0 there exists a \delta>0 such that \sum^N_{k=1}|F(b_k)-F(a_k)|<\epsilon for every finite collection of disjoint intervals (a_k,b_k) whose combined length \sum^N_{k=1}(b_k-a_k) is less than \delta.

It is easy to see that absolute continuity implies uniform continuity and, more importantly, bounded variation and thus differentiability almost everywhere. We will prove that if the derivative of an absolutely continuous function is zero almost everywhere, the function is constant. This gives our desired theorem, one direction of which was proven in the last post:

Result 2: If F is absolutely continuous, then \int^x_a F'(y)dy = F(x)-F(a). On the other hand, if f is integrable, then F(x) = \int^x_a f(y)dy is absolutely continuous and F'(x) = f(x) almost everywhere.

Proof: In the former direction, the equation to be proven makes sense: F is of bounded variation, so F' exists almost everywhere, and by our bound from Fatou’s lemma F' is integrable. For convenience, denote G(x) = \int^x_a F'(y)dy. Now we can use the first main result of our last post: local integrability implies that the limit of \frac{1}{m(B)}\int_BF'(y)dy over balls B shrinking to x equals F'(x) almost everywhere. In other words, G'(x) = F'(x) almost everywhere. Since G is absolutely continuous (being an integral), so is G-F, and its derivative vanishes almost everywhere, so G-F is constant by what we will prove below. So F(b)-F(a) = G(b)-G(a) =\int^b_a F'(x)dx.

In the other direction, we already know this by the Lebesgue differentiation theorem of the last post.

To conclude this post, we prove that an absolutely continuous function whose derivative is zero almost everywhere is constant.

Proof: It suffices to prove that F takes the same value at the endpoints of the interval [a,b], because then we can apply the same argument to any subinterval we wish. Let E be the set of x in (a,b) for which F'(x) = 0; by hypothesis it has measure b-a. Fix \epsilon>0. By the definition of the derivative, we can find for every x\in E and every \nu>0 an interval (a_x,b_x) containing x, of width less than \nu, such that |F(b_x)-F(a_x)|\le \epsilon(b_x-a_x). This is an example of a Vitali covering: a covering by balls such that for every point in the set and every \nu>0, there is a ball in the covering containing that point and of measure less than \nu. It turns out that, roughly speaking, for any Vitali covering we can find a disjoint sub-collection of balls whose combined measure is arbitrarily close to that of the set being covered, i.e. for any \delta>0 there are finitely many disjoint B_1,...,B_N in the covering such that \sum^N_{i=1}m(B_i) \ge m(E)-\delta.

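If we can prove this, then we can find a disjoint sub-collection of intervals I_i = (a_i,b_i) such that \sum^N_{i=1}m(I_i)\ge m(E)-\delta = b-a-\delta and such that \sum^N_{i=1}|F(b_i)-F(a_i)|\le\epsilon\sum^N_{i=1}(b_i-a_i)\le\epsilon(b-a), which can be made arbitrarily small. The complement of \cup I_i in [a,b] consists of finitely many intervals [\alpha_k,\beta_k] of total length at most \delta. Because we’re able to shrink this complement as we wish by our Vitali result, by absolute continuity the sum of |F(\beta_k)-F(\alpha_k)| over the intervals making up the complement can also be made arbitrarily small. By the triangle inequality, |F(b)-F(a)| is at most the sum of these two quantities, so we are done.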

So it remains to show the result for Vitali coverings. In fact we can make use of our covering argument from the previous post: given a finite collection of balls, we can find a disjoint sub-collection whose total measure is within a factor of 3^d of the measure of the union. The argument is then simply: i) pick a subset E'\subset E which is compact and has measure at least \delta, ii) cover it by finitely many balls from the Vitali covering and invoke the 3^d result to find a disjoint sub-collection B_k whose total measure is at least 3^{-d}m(E')\ge 3^{-d}\delta, iii) if this total is at least m(E)-\delta, we’re done; otherwise repeat on E minus the union \cup \bar{B_k} of the current balls’ closures, using only the balls in the original Vitali covering which do not intersect any of the B_k. Each round adds at least 3^{-d}\delta to the total measure, so the process stops after finitely many rounds.
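
For intuition about the 3^d step in dimension d = 1, here is a minimal Python sketch. It assumes the covering argument in question is the standard greedy one (repeatedly keep the longest remaining interval and discard everything it meets); the function names are invented for this example.

```python
def greedy_disjoint(intervals):
    """Pick intervals greedily from longest to shortest, keeping one only
    if it is disjoint from every interval already kept."""
    chosen = []
    by_length = sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True)
    for a, b in by_length:
        # Open intervals (a, b) and (c, d) are disjoint iff b <= c or d <= a.
        if all(b <= c or d <= a for c, d in chosen):
            chosen.append((a, b))
    return chosen

def union_length(intervals):
    """Lebesgue measure of a finite union of intervals."""
    total, right = 0.0, float("-inf")
    for a, b in sorted(intervals):
        if b > right:
            total += b - max(a, right)
            right = b
    return total

cover = [(0.0, 4.0), (3.0, 5.0), (4.5, 6.0), (10.0, 11.0)]
chosen = greedy_disjoint(cover)
kept = sum(b - a for a, b in chosen)

# Every discarded interval meets a kept interval at least as long, so it
# lies inside that interval's threefold dilate: union <= 3 * kept.
print(chosen, kept, union_length(cover))
```

In this sample the greedy choice keeps three of the four intervals, with total length 6.5 against a union of length 7, comfortably within the factor of 3 that the argument guarantees.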

