
Electronic Colloquium on Computational Complexity, Report No. 57 (2012)

Pseudorandomness from Shrinkage

Russell Impagliazzo

University of California, San Diego and Institute for Advanced Study

Raghu Meka∗

Institute for Advanced Study

David Zuckerman†

University of Texas at Austin and Institute for Advanced Study

Abstract

One powerful theme in complexity theory and pseudorandomness over the past few decades has been the use of lower bounds to give pseudorandom generators (PRGs). However, the general results using this hardness vs. randomness paradigm suffer a quantitative loss in parameters, and hence give no nontrivial implications for models where we don't know super-polynomial lower bounds but do know lower bounds of a fixed polynomial. We show that when such lower bounds are proved using random restrictions, we can construct PRGs that are essentially best possible without in turn improving the lower bounds.

More specifically, say that a circuit family has shrinkage exponent Γ if a random restriction leaving a p fraction of variables unset shrinks the size of any circuit in the family by a factor of p^{Γ+o(1)}. Our PRG uses a seed of length s^{1/(Γ+1)+o(1)} to fool circuits in the family of size s.
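In symbols (writing L(f) for the size of f and f|_ρ for its restriction under ρ, as defined in Section 2), the shrinkage hypothesis and the resulting seed length read:

```latex
% Shrinkage exponent \Gamma: a random restriction \rho leaving a
% p fraction of variables unset shrinks circuit size as
\mathbb{E}_{\rho}\bigl[ L(f|_{\rho}) \bigr] \le p^{\Gamma + o(1)} \cdot L(f),
% and the resulting PRG for circuits of size s has seed length
r = s^{1/(\Gamma+1) + o(1)}.
```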

By using this generic construction, we get PRGs with polynomially small error for the following classes of circuits of size s, with the following seed lengths:

1. For de Morgan formulas, seed length s^{1/3+o(1)};

2. For formulas over an arbitrary basis, seed length s^{1/2+o(1)};

3. For read-once de Morgan formulas, seed length s^{.234...};

4. For branching programs of size s, seed length s^{1/2+o(1)}.

The previous best PRGs known for these classes used seeds of length bigger than n/2 to output n bits, and worked only when the size s = O(n) [BPW11].

∗ Supported in part by NSF grant DMS-0835373.

† Supported in part by NSF Grants CCF-0916160 and DMS-0835373.

ISSN 1433-8092

# 1 Introduction

Two of the most important general challenges for complexity are to prove constructive lower bounds for non-uniform measures of computational complexity, such as circuit size, and to show that randomized algorithms have efficient deterministic simulations. The "Hardness vs. Randomness" paradigm ([BM84, Yao82, AW85, NW94, Nis91, BFNW93]) shows that these questions are linked.

More precisely, these results show how to use any problem that is hard for a class of circuits to create a pseudorandom generator (PRG) for the same class of circuits. This PRG can then be used to construct a relatively efficient deterministic version of any probabilistic algorithm with a corresponding complexity. This has been used both to create unconditional PRGs for circuit classes with known lower bounds, such as AC0, and to prove conditional results: implications between the existence of hard problems and derandomization for classes where no strong lower bounds are known. In the converse direction, it is easy to see that any PRG for a circuit class immediately gives a corresponding lower bound for the class. Somewhat more surprisingly, it has been shown that any efficient deterministic simulation of some probabilistic algorithms would yield circuit lower bounds ([IKW02, KI04, Wil10]). This hardness vs. randomness connection is one of the most important tools in computational complexity. It formalizes the intuition that efficient algorithms for "metacomputational problems", where the input is a computational device from a certain class, are linked to our ability to prove lower bounds for that class.

However, being so general comes at a quantitative price. Ideally, the stretch of a PRG (the output length as a function of the input length) equals the known lower bound. However, in the hardness-to-randomness constructions, there are a number of stages that each lose a large polynomial factor. In particular, this means that, for example, a quadratic or cubic circuit lower bound for a class does not immediately give any nontrivial PRG. For completely generic, "black-box" reductions between a hard problem and a PRG, some of these costs are inherent ([GV08, SV08, Wat11, AS11b]). In particular, this is an issue for those models where super-linear but not super-polynomial bounds are known, such as Boolean formulas.

In this work, we show a general method for obtaining tight "hardness to randomness" results from the proofs of lower bounds, rather than as a black-box consequence of the lower bounds. In particular, our methods apply to lower bound proofs that involve restricting some of the inputs to the circuit. Our construction goes in two stages. We start with a lower bound proved by the following kind of shrinkage argument: if we restrict a size-s circuit leaving a p fraction of variables unset, the expected size of the restricted circuit is O(p^Γ s). The best Γ for which this holds is known as the "shrinkage exponent" of the circuit class. The first stage of our construction is to derandomize the shrinkage argument, showing that there is a distribution with similar shrinkage properties that can be sampled with few bits. This stage of our argument is general, but not totally generic: while the same general construction and analysis ideas work in a variety of models, the details depend on the model. Then we show how to go from such a distribution on restrictions to a PRG. This part is generic, being identical for all models, and is closely related to the generator from [NZ96]. The total number of bits r used by the generator is roughly s^{1/(1+Γ)} times the number of bits needed to sample from the distribution on restrictions.

Every generator using r bits to fool tests of size s = s(r) immediately gives a problem requiring size Ω(s(r)) to compute in the model. So, if our function s(r) is close to the known lower bounds, this shows that we have essentially converted all of the “hardness” in the lower bound to “randomness”.

This is indeed the case for a variety of natural models of computation. For Boolean formulas over the de Morgan basis, we give a generator with s(r) = r^{3−o(1)}, almost matching the known lower bound of s(n) = Ω(n^3 / log^2 n) due to Håstad ([Hås98], based on earlier work by [Sub61, And87, IN93, PZ93]).

To avoid technicalities, we assume that the size s is at least the number of input variables n in the following statements (if not, one could think of adding dummy variables).

**Theorem 1.1.** For any constant c > 0, there is an explicit PRG using a seed of length s^{1/3} · 2^{O(log^{2/3} s)} = s^{1/3+o(1)} random bits that s^{−c}-fools formulas of size s over the de Morgan basis.

For Boolean formulas over an arbitrary basis, our generator has stretch s(r) = r^{2−o(1)}, almost matching the Khrapchenko bound of s(n) = Ω(n^2) ([Khr71]).

**Theorem 1.2.** For any constant c > 0, there is an explicit PRG using a seed of length s^{1/2} · 2^{O(log^{1/2} s)} = s^{1/2+o(1)} random bits that s^{−c}-fools formulas of size s over an arbitrary basis.

For branching programs, with size being the total number of edges, we get a similar bound.

**Theorem 1.3.** For any constant c > 0, there is an explicit PRG using a seed of length s^{1/2} · 2^{O(log^{1/2} s)} = s^{1/2+o(1)} random bits that s^{−c}-fools branching programs of size at most s.

We also consider the case of read-once formulas over the de Morgan basis. Here, there is no sensible notion of a lower bound, since all functions computable in the model have size exactly n, but the notion of shrinkage is still defined. The optimal shrinkage exponent for such read-once de Morgan formulas was shown by [PZ93, HRY95] to be Γ = log 2 / log(√5 − 1) = 3.27...; using this result, we get a PRG with stretch s(r) = Ω(r^{4.27...}).

**Theorem 1.4.** For any constant c > 0, there is an explicit PRG using a seed of length s^{1/(Γ+1)} · 2^{O(log^{2/3} s)} = s^{1/(Γ+1)+o(1)} random bits that s^{−c}-fools read-once formulas of size s over the de Morgan basis, where Γ = log 2 / log(√5 − 1) = 3.27....

Any substantial improvement in our PRGs would thus yield better lower bounds than what is currently known.

Our results dramatically improve previous work. The only directly comparable PRG was by Bogdanov, Papakonstantinou, and Wan [BPW11], who constructed a PRG using a (1 − Ω(1))n-bit seed to output n bits that fool read-once formulas and read-once branching programs, where the order of the bits is unknown beforehand. There has been significant work on read-once branching programs where the order of the bits is known in advance (e.g., [Nis91, INW94, NZ96]), but that is a much simpler model, and the generators of [Nis91, INW94] are known not to work if the order of bits is unknown [BPW11].

## 1.1 Outline of Constructions

Our techniques build upon those of [Nis92, INW94, NZ96]. The intuition behind all of these PRGs is to exploit communication bottlenecks in the computation. If the random inputs to a computation can be partitioned into two subsets X and Y, and the computation can be simulated with k bits of communication between these two subsets, then given the communication pattern, the two sets of bits have high entropy conditioned on each other. Then, instead of using independent bits for the two sides, we can use a randomness extractor to convert the conditional entropy of X given the communication into random bits to be used in place of Y.
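As a concrete toy instance of this recycling step, the following sketch uses a random GF(2)-linear map as the extractor; random linear maps form a universal hash family, so by the leftover hash lemma their output is close to uniform on sources with sufficient min-entropy. This is only an illustrative stand-in, not the extractor of [NZ96]; all names are ours.

```python
import random

def linear_extract(x_bits, seed, m):
    """Extract m bits from x_bits using a GF(2)-linear map.

    The seed encodes an m x len(x_bits) 0/1 matrix A; the output is
    A @ x over GF(2). Random linear maps are universal hashes, so by
    the leftover hash lemma the output is close to uniform whenever
    the source has enough min-entropy relative to m.
    """
    n = len(x_bits)
    assert len(seed) == m * n
    out = []
    for i in range(m):
        row = seed[i * n:(i + 1) * n]
        out.append(sum(a & b for a, b in zip(row, x_bits)) % 2)
    return out

rng = random.Random(0)
n, m = 16, 4
seed = [rng.randrange(2) for _ in range(m * n)]  # fresh extractor seed
x = [rng.randrange(2) for _ in range(n)]         # the high-entropy source X
y = linear_extract(x, seed, m)                   # bits reused in place of Y
```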

Our construction follows the same basic intuition. The key insight is that shrinkage under random restrictions is a form of communication bottleneck between X, the set of variables with values specified by the restriction ρ, and Y, the set of variables left unrestricted (and chosen later).

Consider a one-way protocol where the player knowing X has to send a message allowing the Y player to compute the function f. What this message needs to specify is exactly the restricted function f|ρ. If the circuit size of f|ρ is small, much smaller than the size of X, the message can be the circuit computing the restricted function, showing low communication.

Most of the previous constructions were for computational models like read-once branching programs, where one had an explicit description of which sets X and Y had a communication bottleneck, and there was a hierarchical structure available on such sets so that the construction could be defined recursively. Here, we have neither, but we know the bottleneck occurs for most sets X and their complements. Instead of explicitly partitioning the variables into blocks with low communication, we randomly sample sets that exhibit this bottleneck until all variables are covered. So far, we are not able to utilize recursion, which blocks us from making the seed size sub-polynomial (and hence from proving super-polynomial lower bounds).
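A toy version of this covering step, using truly random sets in place of the derandomized restrictions the construction actually needs (names and parameters are illustrative):

```python
import random

def cover_by_random_sets(n, p, rng):
    """Sample subsets of [n], keeping each element with probability p
    (the unset variables of a random restriction), until their union
    covers all n variables; return the list of sampled sets.

    In expectation this takes O((log n) / p) rounds."""
    covered = set()
    blocks = []
    while len(covered) < n:
        block = {i for i in range(n) if rng.random() < p}
        if block:
            blocks.append(block)
            covered |= block
    return blocks

rng = random.Random(1)
blocks = cover_by_random_sets(50, 0.2, rng)
```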

Derandomized Shrinkage Bounds. To use our main generator construction, we need a family of random restrictions that can be sampled with few random bits and still causes circuits to shrink. For branching programs and formulas over an arbitrary basis (shrinkage exponent Γ = 1), these are not too hard to get by taking O(log n)-wise independent random restrictions. For formulas over the de Morgan basis and for read-once formulas, getting such restrictions is far trickier.
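For intuition, k-wise independent values can be sampled from few random bits by evaluating a random degree-(k−1) polynomial over a prime field F_q; thresholding the values then yields a roughly p-regular restriction. This is a generic sketch of that standard construction, not the paper's specific derandomization; parameters are illustrative.

```python
import random

def kwise_values(n, k, q, rng):
    """Evaluate a random degree-(k-1) polynomial over F_q at the
    points 0..n-1 (requires q >= n, q prime); any k of the outputs
    are independent and uniform over F_q."""
    coeffs = [rng.randrange(q) for _ in range(k)]
    def poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod q
            acc = (acc * x + c) % q
        return acc
    return [poly(i) for i in range(n)]

def kwise_restriction(n, k, p, q, rng):
    """Map each k-wise independent field element to *, 0, or 1:
    values below p*q become '*' (so each coordinate is unset with
    probability approximately p; a careful construction chooses q
    so that p*q is an integer and balances the 0/1 split exactly)."""
    vals = kwise_values(n, k, q, rng)
    thresh = int(p * q)
    return ['*' if v < thresh else (v & 1) for v in vals]

rng = random.Random(2)
rho = kwise_restriction(n=20, k=8, p=0.25, q=101, rng=rng)
```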

The first difficulty we face is that Håstad's original proof [Hås98] only shows shrinkage in expectation and does not give a high-probability bound for the formulas to shrink. We get around this difficulty as follows. Let f be a formula over the de Morgan basis. We first show shrinkage under restrictions that are k-wise independent for k = n^{o(1)} and leave each variable unset with probability p = n^{−α} for some α = o(1). By repeating this process independently, we get shrinkage for all values of p (both in the known lower bounds and in our PRG construction we need p ∼ n^{−1/(Γ+1)}).

To do this, we decompose the target formula f into O(n/k) subformulas g_i, each of size at most k, for a suitable k. Since each g_i has size at most k, the behavior of g_i under k-wise independent restrictions should be the same as under truly random restrictions. Thus, we can roughly expect each g_i to shrink by a factor of p^Γ in expectation.
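To illustrate the decomposition step, here is a sketch (the tree representation and names are ours, not the paper's) that cuts a formula tree into O(n/k) subtrees of size at most k by repeatedly walking into the larger child:

```python
def size(t):
    """Number of leaves of a formula given as nested lists
    ['gate', left, right] with string leaves."""
    return 1 if isinstance(t, str) else size(t[1]) + size(t[2])

def decompose(tree, k):
    """Repeatedly cut off a subtree of size at most k (and, except
    possibly at the very end, more than k/2: we always descend into
    the larger child of a node of size > k). Each cut subtree is
    replaced by a fresh dummy leaf, so an n-leaf formula yields
    O(n/k) pieces; pieces may contain dummy leaves."""
    pieces = []
    cuts = 0
    root = ['root', tree, 'dummy']  # wrapper so the root's child is mutable
    while size(root[1]) > k:
        parent, idx = root, 1
        node = parent[idx]
        while size(node) > k:  # walk toward the larger child
            parent, idx = node, (1 if size(node[1]) >= size(node[2]) else 2)
            node = parent[idx]
        pieces.append(node)
        cuts += 1
        parent[idx] = f'_cut{cuts}'  # dummy placeholder leaf
    pieces.append(root[1])
    return pieces

def balanced(names):
    """A balanced AND tree over the given variable names."""
    if len(names) == 1:
        return names[0]
    mid = len(names) // 2
    return ['and', balanced(names[:mid]), balanced(names[mid:])]

tree = balanced([f'x{i}' for i in range(16)])
pieces = decompose(tree, 4)
```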

For read-once formulas, the events that the different g_i shrink are independent, and hence by a Chernoff bound, with high probability the total shrinkage is as promised. For the read-t case (each variable appears at most t times in the formula), we partition the subformulas into t + 1 color classes, such that within a color class the subformulas are on disjoint sets of variables. We can then proceed as in the read-once case. For the general case, we condition on heavy variables (the ones that appear many times) in a subtle way and reduce to the read-t case.

# 2 Preliminaries

We start with some definitions and notation.

• For a restriction ρ ∈ {0, 1, ∗}^n, let the set of active variables be A(ρ) = {i : ρ_i = ∗}.

• For ρ ∈ {0, 1, ∗}^n and f : {0, 1}^n → {0, 1}, define the ρ-restricted function f_ρ : {0, 1}^{A(ρ)} → {0, 1} by f_ρ(y) = f(x), where x ∈ {0, 1}^n satisfies x_i = y_i for i ∈ A(ρ) and x_i = ρ_i otherwise.

• Call a distribution D on {0, 1, ∗}^n p-regular if for every i ∈ [n], Pr_{ρ←D}[ρ_i = ∗] = p. We say D is k-wise independent if any k coordinates of D are independent. There exist explicit k-wise independent distributions samplable with O(k(log n) log(1/p)) random bits [AS11a].

• For a class of functions F on n variables, we say s : F → N \ [n] is a size function if |{f ∈ F : s(f) ≤ m}| ≤ m^{O(m)} for m ≥ n. By default, we shall assume that F is closed under negating input variables, and that s(g) = O(s(f)) whenever g is obtained from f ∈ F by negating some input variables. We also assume that any f ∈ F depends on at most s(f) variables.

• We say two distributions D, D′ (on the same universe) are ε-close if the statistical distance between D and D′ is at most ε.

• We say a generator G : {0, 1}^r → {0, 1}^n δ-fools a function f : {0, 1}^n → {0, 1} if |E[f(G(U_r))] − E[f(U_n)]| ≤ δ, where U_m denotes the uniform distribution on {0, 1}^m. Similarly, we say G δ-fools a class of functions F if G δ-fools all elements of F. We refer to the parameter r as the seed length of the generator G, and say G is explicit if G can be computed in poly(n, 1/δ) time.

• Throughout, we use upper case letters for random variables and lower case for constants.
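As a concrete instance of the first two definitions, a restriction can be applied to a Boolean function as follows (a small illustrative helper, not from the paper):

```python
def restrict(f, rho):
    """Given f on n bits and rho in {0, 1, '*'}^n, return the
    restricted function f_rho on the active variables
    A(rho) = {i : rho[i] == '*'}, taken in increasing index order."""
    active = [i for i, r in enumerate(rho) if r == '*']
    def f_rho(y):
        assert len(y) == len(active)
        x = list(rho)
        for i, yi in zip(active, y):
            x[i] = yi  # fill active positions from y; the rest keep rho
        return f(x)
    return f_rho

# example: the parity of 3 bits restricted by rho = ('*', 1, '*')
parity = lambda x: sum(x) % 2
g = restrict(parity, ('*', 1, '*'))
```

Here g is a function on the two active variables; for instance g((0, 0)) equals parity([0, 1, 0]).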