# Differentiating the Integral

Now let’s put the machinery we’ve built up to use in making precise the familiar notion of differentiation and integration being dual to each other. It is easy to see in one direction why this makes sense: roughly speaking, the derivative of an integral is the limit of the average value of a function over a neighborhood as the measure of that neighborhood approaches zero. For this reason the first problem we will address in this post is called the averaging problem, namely, if $f$ is Lebesgue-integrable, do we have that $\lim_{m(B)\to 0, x\in B}\frac{1}{m(B)}\int_Bf(y)dy = f(x)$ for almost all $x$?
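To get a feel for the question, here is a minimal sketch in dimension $d=1$, assuming $f$ is the indicator function of $[0,1]$ and using only balls centered at $x$ (both choices are ours, made so that the averages have a closed form). The helper names are illustrative.

```python
def indicator(x, a=0.0, b=1.0):
    """f = indicator function of the closed interval [a, b]."""
    return 1.0 if a <= x <= b else 0.0


def centered_average(x, h, a=0.0, b=1.0):
    """Exact average of the indicator of [a, b] over the ball (x - h, x + h)."""
    overlap = max(0.0, min(b, x + h) - max(a, x - h))
    return overlap / (2.0 * h)


radii = [10.0 ** (-k) for k in range(1, 6)]

# Interior point: the averages agree with f(0.5) = 1 once the ball fits in [0, 1].
interior_gaps = [abs(centered_average(0.5, h) - indicator(0.5)) for h in radii]

# Jump point: the centered averages stay at 1/2 while f(0) = 1.
jump_gaps = [abs(centered_average(0.0, h) - indicator(0.0)) for h in radii]

print(interior_gaps)
print(jump_gaps)
```

The averages fail to recover $f$ only at the two jump points, a set of measure zero, which is consistent with asking for convergence at almost every $x$ rather than every $x$.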

Result 1: We can answer the averaging problem in the affirmative.

Well it’s certainly in the affirmative for continuous functions, and one of the keys is to observe that continuous functions of compact support are dense in the space $L^1$ of Lebesgue-integrable functions. We already know simple functions are dense, so step functions are as well, and the most basic step function, namely the characteristic function of a rectangle, can be approximated arbitrarily closely in $L^1$ by continuous functions of compact support. So for any $f$ in $L^1$, pick such a continuous function $g$ with $\left|\left|f-g\right|\right|$ as small as we like, where $\left|\left|\cdot\right|\right|$ denotes the $L^1$ norm. Then we can rewrite $\frac{1}{m(B)}\int_Bf(y)dy - f(x)$ as

$\frac{1}{m(B)}\int_B(f(y)-g(y))dy + \left(\frac{1}{m(B)}\int_Bg(y)dy - g(x)\right) +(g(x)-f(x)).$

Take absolute values and then the limit superior of both sides over balls that contain $x$ as $m(B)\to 0$; because $g$ is continuous, the middle term on the right vanishes. We want to show that for any given $\alpha>0$, the set $E_{\alpha}$ on which the limit superior of the left side, i.e. the gap between $f$ and the limit of its average value, exceeds $\alpha$ has measure zero. If that limit superior exceeds $\alpha$ at $x$, then at least one of the two surviving terms exceeds $\alpha/2$ there, so it is enough to control, for every threshold, the sets where each of those terms is large. By Chebyshev’s inequality, $|g(x)-f(x)|>\alpha$ on a set of measure at most $\frac{1}{\alpha}\left|\left|f-g\right|\right|$, and $\limsup\left|\frac{1}{m(B)}\int_B(f(y)-g(y))dy\right|\le (f-g)^*(x)$, where $(f-g)^*(x)=\sup_{B\ni x}\frac{1}{m(B)}\int_B|f(y)-g(y)|dy$ is the so-called Hardy-Littlewood maximal function of $f-g$.

It suffices to show that the set $E'_{\alpha}$ on which the maximal function exceeds $\alpha$ also has measure on the order of $\left|\left|f-g\right|\right|$: since $g$ can be chosen to make $\left|\left|f-g\right|\right|$ arbitrarily small, $E_{\alpha}$ must then have measure zero. Indeed, it turns out that $m(E'_{\alpha})$ is at most $\frac{3^d}{\alpha}\left|\left|f-g\right|\right|$. The $3^d$ factor comes from the fact that in any finite collection $C$ of open balls, there is a sub-collection of disjoint balls whose measures total at least $\frac{1}{3^d}$ of the measure of the union of $C$ (this follows easily from the fact that if we blow up the biggest ball $B$ to thrice its radius, it’ll contain any balls that would have intersected $B$; so select $B$, discard the balls that meet it, and repeat with what remains).
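In dimension $d=1$, where balls are open intervals, the selection step can be sketched as follows. The names `select_disjoint`, `union_length` and `covering_bound_holds` are illustrative, and processing the balls in decreasing order of radius is one way to realize "take the biggest ball, discard what it meets, repeat".

```python
import random


def select_disjoint(balls):
    """Greedy selection: take the largest remaining ball, drop everything it meets.

    Each ball is an open interval given as a (center, radius) pair.
    """
    chosen = []
    for c, r in sorted(balls, key=lambda ball: ball[1], reverse=True):
        # Two open intervals are disjoint exactly when their centers are at
        # least the sum of the radii apart.
        if all(abs(c - c2) >= r + r2 for c2, r2 in chosen):
            chosen.append((c, r))
    return chosen


def union_length(balls):
    """Lebesgue measure of the union of the intervals (c - r, c + r)."""
    total = 0.0
    current_lo = current_hi = None
    for lo, hi in sorted((c - r, c + r) for c, r in balls):
        if current_hi is None or lo > current_hi:
            if current_hi is not None:
                total += current_hi - current_lo
            current_lo, current_hi = lo, hi
        else:
            current_hi = max(current_hi, hi)
    if current_hi is not None:
        total += current_hi - current_lo
    return total


def covering_bound_holds(balls):
    """Check 3^d * (total measure of the chosen balls) >= m(union), with d = 1."""
    chosen_measure = sum(2.0 * r for _, r in select_disjoint(balls))
    return 3.0 * chosen_measure >= union_length(balls) - 1e-12


rng = random.Random(0)
random_balls = [(rng.uniform(0.0, 10.0), rng.uniform(0.05, 1.0)) for _ in range(50)]
print(select_disjoint([(0.0, 1.0), (1.5, 0.6), (2.4, 0.5), (5.0, 0.2)]))
print(covering_bound_holds(random_balls))
```

Every discarded interval meets a chosen interval of at least its own radius, so it sits inside the threefold dilate of that chosen interval; this is the whole reason the factor is $3^d$.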

Now for each $x\in E'_{\alpha}$, we can find an open ball $B_x\ni x$ so that $\frac{1}{m(B_x)}\int_{B_x}|f(y)-g(y)|dy>\alpha$ and thus $m(B_x)<\frac{1}{\alpha}\int_{B_x}|f(y)-g(y)|dy$, and balls of this form cover $E'_{\alpha}$. Pick a compact subset $S$ of $E'_{\alpha}$ so that we can pick a finite subcover $C$ and then apply the above covering result to get a disjoint subcollection $B_1,\dots,B_n$. Then $m(S)\le 3^d\sum_j m(B_j)<\frac{3^d}{\alpha}\sum_j\int_{B_j}|f(y)-g(y)|dy\le\frac{3^d}{\alpha}\int_{\mathbb{R}^d}|f(y)-g(y)|dy = O(\left|\left|f-g\right|\right|)$, where the last step uses disjointness. Because $S$ was an arbitrary compact subset of $E'_{\alpha}$, this inequality holds for $m(E'_{\alpha})$ as well, and we are done.
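As a rough sanity check, here is a discrete analogue of this weak-type bound: a function `h` on a finite grid of integers, with counting measure in place of Lebesgue measure and windows of consecutive grid points in place of balls. This setup is our own assumption for illustration; the same covering argument suggests that the number of grid points where the maximal function exceeds $\alpha$ is at most $\frac{3}{\alpha}\sum|h|$.

```python
def discrete_maximal(values):
    """Uncentered maximal function on a grid: for each index k, the largest
    average of |values| over a window of consecutive indices containing k."""
    n = len(values)
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + abs(v))
    best = [0.0] * n
    for i in range(n):
        for j in range(i, n):
            avg = (prefix[j + 1] - prefix[i]) / (j - i + 1)
            for k in range(i, j + 1):
                if avg > best[k]:
                    best[k] = avg
    return best


# A sparse, spiky test function (hypothetical data chosen for illustration).
h = [0.0] * 120
h[10], h[11], h[50], h[90], h[91] = 4.0, -2.0, 3.0, 1.5, 1.5

h_star = discrete_maximal(h)
l1_norm = sum(abs(v) for v in h)


def weak_type_holds(alpha):
    """Check #{k : h*(k) > alpha} <= (3 / alpha) * sum |h|."""
    count = sum(1 for v in h_star if v > alpha)
    return count <= 3.0 * l1_norm / alpha


print([weak_type_holds(alpha) for alpha in (0.25, 0.5, 1.0, 2.0)])
```

The brute-force triple loop is cubic in the grid size, which is fine for a toy check but not how one would compute maximal functions at scale.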

In fact, we can do better. The assumption of global integrability seems like overkill considering differentiability is only a local property. Define a measurable function $f$ to be locally integrable if for every ball $B$, $\int_B |f|<\infty$. Our proof above implies that local integrability is sufficient: the averages near $x$ only involve the values of $f$ on a ball around $x$, so we can apply Result 1 to $f\chi_B$, which is integrable. And what if we want to average over sets other than balls? Any family of sets $U_{\alpha}$ of bounded eccentricity at $x$ will do, i.e. such that there is a constant $c>0$ and, for each $U_{\alpha}$, a ball $B$ containing both $U_{\alpha}$ and $x$ with $m(U_{\alpha})\ge cm(B)$.

The second topic of this post will be an exploration of the Lebesgue set, the set of all points $x$ where $f$ takes on a finite value and $\lim_{m(B)\to 0, x\in B}\frac{1}{m(B)}\int_B|f(y)-f(x)|dy = 0$. This can be thought of as a generalization of the points of continuity to also include some other special points. Firstly we can see that for any locally integrable function, almost all points are in its Lebesgue set. For each rational $r$, apply Result 1 to the locally integrable function $y\mapsto|f(y)-r|$: for almost every $x$, the limit of $\frac{1}{m(B)}\int_B|f(y)-r|dy$ is $|f(x)-r|$. Since there are only countably many rationals, outside a single set of measure zero this holds for every rational $r$ at once. Now for any $x,y$, we know that $|f(y)-f(x)|\le |f(y)-r| + |f(x)-r|$. Averaging over $y\in B$ and passing to the limit, the limit superior of $\frac{1}{m(B)}\int_B|f(y)-f(x)|dy$ is at most $2|f(x)-r|$, which can be made arbitrarily small by choosing $r$ to be a rational arbitrarily close to $f(x)$, as desired.

But in fact the Lebesgue set is even more interesting. First define a good kernel $K_{\delta}$ (this is Stein’s terminology) to be a function parametrized by $\delta$ such that i) it integrates to 1, ii) its absolute value integrates to at most some constant $A$ independent of $\delta$, and iii) for every $r>0$, the integral of its absolute value outside of a ball of radius $r$ tends to zero as $\delta$ does. Visually, we can think of $\delta$ as an index of “narrowness” of the graph, and we call it a kernel because everywhere that $f$ is continuous, the convolution of $f$ with a good kernel approaches $f$ as $\delta\to 0$. We can ask, is there a kernel which approximates unit masses similar to how good kernels do, but which works for any point in the function’s Lebesgue set?

Indeed, define an approximation to the identity, a special kind of good kernel, to be a function $K_{\delta}$ such that i) its integral is 1, ii) its absolute value is bounded by $A\delta^{-d}$, and iii) its absolute value is also bounded by $A\delta/|x|^{d+1}$, for all $\delta>0$ and $x$, with the constant $A$ independent of $\delta$.

As an example, in dimension $d=1$ consider the functions $K_{\delta}$ with support on the range $|x|<\delta$ and equal to $\frac{1}{2\delta}$ there. Each integrates to one, and as $\delta\to 0$ they concentrate into a unit mass at $x=0$, the so-called Dirac delta function $D$. Heuristically, the convolution of $D$ with any $f$ is $f$, because $D(y)=0$ for every $y\neq 0$, so that $\int f(x-y)D(y)dy$ only sees the value $f(x)$ at $y=0$.

Result 2: If $K_{\delta}$ is an approximation to the identity and $f$ is integrable, then the convolution $(f*K_{\delta})(x)$ approaches $f(x)$ as $\delta\to 0$ for every point $x$ in the Lebesgue set of $f$, and in particular for almost every point.

Note that the latter point follows from our above result that almost every point is in the Lebesgue set of a locally integrable function.
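Here is a small numerical sketch of Result 2 in dimension $d=1$. The choice of kernel is ours, not the post's: we assume the Poisson kernel $P_{\delta}(y)=\frac{1}{\pi}\frac{\delta}{y^2+\delta^2}$, which satisfies both size bounds with $A=\frac{1}{\pi}$, and we take $f$ to be the indicator function of $[0,1]$ so that the convolution has a closed form via $\arctan$.

```python
import math

A = 1.0 / math.pi  # constant in the two size bounds for this kernel


def poisson_kernel(y, delta):
    """P_delta(y) = (1/pi) * delta / (y^2 + delta^2), in dimension d = 1."""
    return delta / (math.pi * (y * y + delta * delta))


def satisfies_size_bounds(y, delta):
    """Check |K| <= A / delta and |K| <= A * delta / |y|^2 (small float slack)."""
    value = poisson_kernel(y, delta)
    ok_near = value <= (A / delta) * (1.0 + 1e-12)
    ok_far = y == 0.0 or value <= (A * delta / (y * y)) * (1.0 + 1e-12)
    return ok_near and ok_far


def smoothed_indicator(x, delta):
    """(f * P_delta)(x) for f the indicator of [0, 1], via the arctan antiderivative."""
    return (math.atan(x / delta) - math.atan((x - 1.0) / delta)) / math.pi


deltas = [10.0 ** (-k) for k in range(1, 6)]

# x = 0.5 is a point of continuity, hence in the Lebesgue set: errors shrink.
interior_errors = [abs(smoothed_indicator(0.5, dl) - 1.0) for dl in deltas]

# x = 0 is a jump, not in the Lebesgue set: the values tend to 1/2, not f(0) = 1.
jump_values = [smoothed_indicator(0.0, dl) for dl in deltas]

print(interior_errors)
print(jump_values)
```

The jump at $x=0$ shows why the statement is about the Lebesgue set: there the averages of $|f(y)-f(0)|$ stay near $\frac{1}{2}$, and the convolution converges to the midpoint of the jump instead of $f(0)$.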

The proof actually boils down to some algebra bashing. First, using that $K_{\delta}$ integrates to 1, bound the difference between the convolution and $f$ by $\left|\int(f(x-y)-f(x))K_{\delta}(y)dy\right|\le \int|f(x-y)-f(x)||K_{\delta}(y)|dy$. Split the integral into integrals over the ball $|y|\le\delta$ and over successive annuli $2^k\delta<|y|\le 2^{k+1}\delta$ for $k\ge 0$. By the former of our two bounds on the absolute value of $K_{\delta}$, the integral over the ball is bounded above by $A\delta^{-d}\int_{|y|\le\delta}|f(x-y)-f(x)|dy$. The integral over the $k$th annulus is bounded likewise, using the latter bound on $|K_{\delta}|$, by $\frac{A\delta}{(2^k\delta)^{d+1}}\int_{|y|\le 2^{k+1}\delta}|f(x-y)-f(x)|dy$. The coefficient in front of the integral can be written as $A'2^{-k}(2^{k+1}\delta)^{-d}$, where $A' = A2^d$. We do this so that we can rewrite the bounds on the integral over the ball and the integral over the $k$th annulus as $AF(\delta)$ and $A'2^{-k}F(2^{k+1}\delta)$, respectively, where $F(r) = \frac{1}{r^d}\int_{|y|\le r}|f(x-y)-f(x)|dy$, which we can think of as an average deviation from $f(x)$ over the neighborhood of radius $r$ centered at $x$.

It turns out that $F$ has certain nice properties that allow us to quickly finish the proof: continuity for $r>0$, vanishing as the neighborhood shrinks, and, as a consequence, boundedness. For the sake of preserving flow, we defer proof of continuity to the end of this post. The fact that $F$ vanishes as $r\to 0$ follows directly from the fact that $x$ lies in the Lebesgue set. Continuity and vanishing imply that $F$ is bounded on any interval $(0,R]$. For larger $r$, $F(r) \le \frac{1}{r^d}\left|\left|f\right|\right|+v_d|f(x)|<\infty$, where $v_d$ is the volume of the unit $d$-ball, and this bound only improves as $r$ grows.
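To make $F$ and the dyadic bound concrete, here is a sketch in dimension $d=1$ under assumptions of our own: $f$ is the indicator function of $[0,1]$, $x=\frac{1}{2}$, and the kernel is the Poisson kernel $P_{\delta}(y)=\frac{1}{\pi}\frac{\delta}{y^2+\delta^2}$, for which $A=\frac{1}{\pi}$ works in both size bounds. With these choices everything has a closed form, and the infinite sum is truncated at a hypothetical cutoff of 200 terms.

```python
import math


def F(r):
    """F(r) = (1/r) * integral over |y| <= r of |f(x - y) - f(x)| dy,
    for d = 1, f the indicator of [0, 1] and x = 1/2."""
    # The integrand equals 1 exactly when x - y leaves [0, 1], i.e. when |y| > 1/2.
    return max(0.0, 2.0 * r - 1.0) / r


def crude_bound_on_F(r):
    """||f||_1 / r^d + v_d |f(x)| with ||f||_1 = 1, v_1 = 2 and f(x) = 1."""
    return 1.0 / r + 2.0


def dyadic_bound(delta, A=1.0 / math.pi, terms=200):
    """A F(delta) + A' * sum_k 2^{-k} F(2^{k+1} delta), where A' = A * 2^d and d = 1."""
    A_prime = 2.0 * A
    annuli = sum(2.0 ** (-k) * F(2.0 ** (k + 1) * delta) for k in range(terms))
    return A * F(delta) + A_prime * annuli


def actual_error(delta):
    """|(f * P_delta)(1/2) - f(1/2)|, using the arctan antiderivative of P_delta."""
    return abs(2.0 * math.atan(0.5 / delta) / math.pi - 1.0)


for delta in (0.5, 0.1, 0.01, 0.001):
    print(delta, actual_error(delta), dyadic_bound(delta))
```

Here $F(r)=0$ for $r\le\frac{1}{2}$ because $x$ is an interior point, and $F$ never exceeds 2, so both the vanishing and the boundedness used in the proof are visible directly.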

Returning to our proof, we can make our upper bound on the sum of the integrals over the annuli, $A'\sum^{\infty}_{k=0}2^{-k}F(2^{k+1}\delta)$, arbitrarily small. For any $\epsilon>0$, choose $N$ so large that the tail satisfies $\sum^{\infty}_{k=N}2^{-k}F(2^{k+1}\delta)\le M\sum^{\infty}_{k=N}2^{-k} = 2^{1-N}M\le\epsilon M$, where $M$ is an upper bound of $F$; note that this $N$ does not depend on $\delta$. For the remaining finitely many summands, we can shrink $\delta$ as needed to make each of them arbitrarily small, by vanishing. We can also shrink the upper bound on the integral over the ball, $AF(\delta)$, arbitrarily small, so we are done.

As promised, we must prove continuity. In fact, the continuity of $F$ follows from a more general property of all integrable functions, absolute continuity of the integral. This is the property that there exists for any $\epsilon>0$ a $\delta>0$ such that $\int_E|f|<\epsilon$ whenever the measure of $E$ is less than $\delta$. Applied to $y\mapsto|f(x-y)-f(x)|$, which is integrable over bounded sets, it says that $\int_{|y|\le r}|f(x-y)-f(x)|dy$ changes by less than $\epsilon$ when $r$ moves slightly, since the annulus between the two radii has small measure; dividing by $r^d$ preserves continuity for $r>0$. Fortunately, monotone convergence kills the absolute continuity claim: basically, approximate $f$ by copies of itself except with high parts cut off. More precisely, let $E_N$ be the set of $x$ for which $|f(x)|\le N$, and let $f_N = f\chi_{E_N}$. By monotone convergence, we can make $\int(|f|-|f_N|)$ arbitrarily small by picking a high enough $N$, and once $N$ is fixed, boundedness of $f_N$ gives $\int_E|f_N|\le Nm(E)$, which we can make arbitrarily small by shrinking $E$. Then $\int_E |f| = \int_E(|f|-|f_N|)+\int_E|f_N|$ can indeed be made arbitrarily small, as desired.
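The two-step estimate behind absolute continuity can be traced through on a concrete unbounded example. As an assumption for illustration, take $f(x)=x^{-1/2}$ on $(0,1]$ and zero elsewhere, which is integrable but blows up at the origin. Cutting off at height $N$ removes $f$ exactly on $(0,N^{-2})$, so the tail integral is $2/N$, and the function names below are hypothetical.

```python
import math


def tail_of_truncation(N):
    """Integral of f - f_N, where f(x) = x**-0.5 on (0, 1] and f_N keeps f
    only where f <= N. Valid for N >= 1, since f > N exactly on (0, N**-2)."""
    return 2.0 / N


def delta_for(eps):
    """A delta that works for eps (assuming eps <= 4 so that N >= 1)."""
    N = 4.0 / eps            # makes the tail 2 / N equal to eps / 2
    return eps / (2.0 * N)   # then N * m(E) < eps / 2 whenever m(E) < delta


def worst_case_integral(m):
    """Integral of f over (0, m): f is decreasing, so among sets of measure m
    this is where the integral of f is largest."""
    return 2.0 * math.sqrt(m)


for eps in (1.0, 0.1, 0.01):
    d = delta_for(eps)
    print(eps, d, worst_case_integral(d))
```

The order of the choices matters: $N$ is fixed first to control the tail, and only then is $\delta$ chosen in terms of $N$, which is why $\delta$ here scales like $\epsilon^2$ rather than $\epsilon$.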