(This is a continuation of a previous post.)

We begin by building up some of the canonical technology of **expander graphs**. ** **Morally, we can think of an expander graph as a building block characterized paradoxically by connectivity and sparseness, out of which we can construct families of increasingly larger graphs increasing in the number of vertices but barely growing in degree, i.e. by taking some combination of graph powers, tensor products, zigzag products, etc. Here are a few measures of the connectivity characteristic of expander graphs:

**Definition 1 (vertex expansion): **A directed multigraph (the word “graph” will be used to denote these from now on unless otherwise stated) is a **-vertex expander** if all sets of at most vertices have at least neighbors. In other words, all small enough collections of vertices are well connected to the entire graph.

**Definition 2 (spectral expansion)**: A regular graph is a **-spectral expander** if the spectral gap of its Laplacian is at least . Morally, the algebraic “mixing” of is sufficiently fast.

**Definition 3 (edge expansion)**: For subsets of the vertices of a graph , let denote the cut size between and , i.e. the number of directed edges from vertices in to vertices in . A -regular graph is a -edge expander if for all such that .

In other words, all small enough collections of vertices have lots of edges emanating from them. It turns out that these three criteria for expansion are closely related (the following results aren’t immediately relevant to error reduction, but as a newbie learning about expander graphs for the first time, I think they form a neat connection between the algebraic and the combinatorial).

**Result 1: **If is a -spectral expander with vertices, it is also an -vertex expander.

**Proof**: Let denote the Laplacian of and the uniform distribution on a given set of vertices . The key to this proof is that we can express the size of in terms of the collision probability of (i.e. the probability that two independent samples from are the same) and the neighborhood size in terms of the size of the support of (i.e. the set of nonzero coordinates of – note that this notation is nonstandard and only used for typographical simplicity). We can quickly verify that for any probability distribution , i) , and ii) Cauchy-Schwarz gives that . Fact i) tells us that

,

and because, as we mentioned above, , we can use fact ii) and the aforementioned observation that to see that our above inequality becomes

,

and solving for an appropriate lower bound on gives the desired result.

**Result 2 (Alon)**: If is a -regular -vertex expander, then is a -spectral expander.

**Proof**: We defer this to the next post.

Before we state the connection between edge expansion and vertex/spectral expansion, we introduce a famously useful result, the Expander Mixing Lemma.

**Result 3: **For a -regular -spectral expander of vertices, then for any sets of densities and , we have

**Proof**: This neat result follows from a bit of computation after observing that if and are the characteristic row vectors for and and is the adjacency matrix of , then (the th coordinate of gives the number of vertices in adjacent to vertex , so taking the dot product of this with gives the number of edges from to ).

Denoting to be the uniform distribution, we break up and into orthogonal components and for some and . Substituting these into our expression for in terms of the characteristic row vectors of , we can get rid of cross terms by orthogonality (we leave these manipulations as an exercise), ending up with

,

where denotes the Laplacian of . By definition of spectral conclusion, we get that and we can rewrite the norms of the two characteristic vectors using the Pythagorean theorem to get the desired inequality.

Observe that by taking to be and the lower bound of to be , we get that every -regular -spectral expander of is an edge expander.

The intuition behind the Expander Mixing Lemma is that the fraction of all directed edges in the graph that are going from a set of vertices to another set should be roughly the product of the densities of those two sets, suggesting that perhaps it’s good to think of expander graphs as random graphs in the following sense:

**Result 4 (Pinsker)**: For any , there is some so that a random -regular undirected graph with vertices is an -vertex expander at least half the time.

**Proof**: Most sources online only give the proof for the simple case in which the graph is -leftregular bipartite. We will present Pinsker’s original proof for the general case in a future post. For now, again, this result isn’t immediately relevant to random-efficient error reduction.

Now let’s move on to how to achieve random-efficient error reduction using expander graphs. We will get the randomness needed in our BPP algorithm out of a random walk on the expander graph; morally the good mixing property of spectral expanders gives us a probability distribution that converges pretty nicely to a uniform one, so by taking the nodes visited in our walk to be the “coin tosses” in our BPP algorithm, it turns out we can get both the random-efficiency of pairwise independent sampling and the time-efficiency of purely independent sampling. Specifically, we have the following Chernoff bound for random walks on expander graphs:

**Result 5 ((Another) Chernoff Bound)**: If is a regular -spectral expander with vertices and is the function we are approximating, then for a random walk where is uniformly randomly chosen,

where denotes the expected value of . In other words, if is sufficiently small (e.g. by taking to a sufficiently large power), then the probability that our approximation has error at least decreases exponentially in the number of repetitions .

Before we prove this, we give a convenient lemma that morally says that at each step of our random walk, we are jumping anywhere at random on the graph with probability and jumping somewhere that won’t change the current probability distribution too much.

**Lemma**: We can break up the Laplacian of a regular -spectral expander into , where is the Laplacian of the complete -graph and is an “error” matrix with norm bounded above by 1.

**Proof**: As usual, let denote . We need to show that . Every vector we can write as a linear combination of uniform distribution and a vector perpendicular to , so it suffices to show that right-multiplication of or any such orthogonal vector by preserves orthogonality and does not increase norm. Firstly, . On the other hand, regularity of tells us that the sum of the coordinates of remains zero so that is still orthogonal to . On the other hand, , which has norm bounded above by 1 by definition of spectral expansion, completing our proof of the lemma.

Returning to our proof of the Chernoff bound for expander walks, we try to bound the moment-generating function for and then use Markov’s inequality, though this is a bit hairy because our random variables aren’t independent… We bound we construct as follows:

Let be the diagonal matrix with entries . We can verify with a simple inductive argument that equals the sum of the coordinates of , and because is regular, this is the same as the sum of the coordinates of , which we can bound above by using Cauchy-Schwarz, giving that .

To bound , decompose it using the above lemma into . A bit of computation and Taylor expansion gives us that , while the fact that gives us that . Some more Taylor expansion after plugging these inequalities into our expression for the moment-generating function gives us that . Markov’s inequality tells us that

For chosen to be a sufficiently small fraction of , we get an upper bound of . By symmetry, if we had taken our function to be instead, we would get that , and we’re done.

The Chernoff bound on expander walks tells us that a random walk of length suffices to cut error to . More precisely, if we have a BPP algorithm requiring repetitions with random bits per repetition, we can now pay the per-repetition cost of random bits in picking the first vertex of the walk and then pay a constant number of bits (depending on the degree of the expander) per subsequent step of the walk, meaning that random-efficient error reduction only requires repetitions and random bits!

I’ll tie up some loose ends in the next two posts, including the results due respectively to Alon and Pinsker that 1) edge expansion implies spectral expansion and 2) (general) random regular graphs are good expanders.

EDIT: I got a bit ahead of myself and went ahead with blogging about later chapters of the survey instead of proving Alon’s and Pinsker’s results, but I promise I’ll tie up all loose ends soon!