The following series of posts will document things that I’ve learned from a neat survey on pseudorandomness by Prof. Salil Vadhan at Harvard. They’re mostly to help me digest the material, but I hope they’ll be of interest to anyone reading this blog.
Specifically, this first set of two posts will be about the power of expander graphs in randomness-efficient error reduction in randomized algorithms; this first post will present two naive approaches to error reduction. We begin with the following definition.
Definition 1: A language lies in class BPP (bounded-error probabilistic polynomial-time) if there exists a probabilistic polynomial-time algorithm such that the probability that accepts an element x is if and is if , where is some polynomial in the length of the input. Stated another way, BPP is the class of languages with probabilistic polynomial-time algorithms with two-sided error at most .
Intuitively, taking multiple repetitions should push the error in any such arbitrarily low, thanks to the following standard result from probability theory:
Result 1 ((A) Chernoff Bound): Let be independent random variables from . Let be their average , and let be the expected value of . Then , that is, the probability of the average differing from its expected value by some quantity falls exponentially in the number of repetitions .
For an algorithm with error at most , consider the corresponding algorithm which runs for repetitions. Let be the average of the indicator variables which each take on a value of 1 if the algorithm outputs the right answer on the th repetition and 0 otherwise. By the Chernoff bound above for , we get that . Taking to be , we get that the probability is bounded above by as long as is large enough. Here, reducing error to comes at a cost of k repetitions and, if our algorithm requires random bits, requires random bits, because our repetitions are completely independent.
The second approach to error-efficient makes a tradeoff between time and randomness by loosening the requirement of strict independence into one of pairwise independence.
Observation 1: If are independent random bits, then for each nonempty set of indices , let denote the XOR of all where . These random variables are pairwise independent. In other words, it takes only independent random bits to generate pairwise independent bits.
Definition 2: A family of (hash) functions , where denotes the set , is said to be pairwise independent if for all distinct , distinct , and in our family of functions, .
Result 2: For all , we can construct a family of pairwise independent hash functions where we can pick a random function from this family using independent random bits.
Proof: Indeed, for any finite field , we can construct a family of pairwise independent hash functions given by .
By taking to be , we can construct a family of pairwise independent hash functions . Choosing a function from this family requires random bits for both of and .
To construct a family of hash functions where input length exceeds output length , restrict the output of each function from to the first $n$ bits; this requires random random bits for the family . In fact, we can do better: rather than restricting the output of functions , i.e. of the form for , restrict functions of the form where and . This requires only random bits.
To construct where , pad the input with extra zeroes so that for each function , we can define a function ; this requires random bits for the family , completing the proof.
Before we discuss how this means we require fewer purely independent random bits, we use the following pairwise (and thus slightly weaker) equivalent of the Chernoff bound, which follows from a straightforward application of Chebyshev’s, to find the time price we need to pay for using fewer random bits.
Result 3: Let be pairwise independent random variables from . Let be their average , and let be the expected value of . Then , that is, the probability of the average differing from its expected value by some quantity falls linearly with respect to the number of repetitions .
What do we get out of all this with regards to error reduction? If we already have a BPP algorithm that requires repetitions with random bits per repetition to reduce error to , we can get out another BPP algorithm in which we first select a function from using and then, because Result 3 tells us we now need repetitions as opposed to , run this hash function over all inputs from to achieve the same amount of error reduction.
In other words, while we’ve reduced the number of random bits needed from in the case of purely independent repetitions, to in the case of pairwise independent repetitions, we’ve bumped the number of repetitions needed from to .
In the next post in this series, we’ll discuss how expander graphs do away with this tradeoff by taking time comparable to independent repetitions and randomness comparable to pairwise independent repetitions.