
# Game interpretation of Kolmogorov complexity

Andrej A. Muchnik∗

Abstract

The Kolmogorov complexity function K can be relativized using any oracle A, and most properties of K remain true for the relativized versions K^A. We provide an explanation for this observation by giving a game-theoretic interpretation and showing that all “natural” properties are either true for all K^A or false for all K^A if we restrict ourselves to sufficiently powerful oracles A.

## 1 Main theorem and its proof

Consider all functions defined on the set of binary strings with non-negative integer values, i.e., the set F = ℕ^({0,1}*). Let α be a property of such functions (i.e., a subset of F). We say that α is O(1)-stable if f1 ∈ α ⇔ f2 ∈ α for any two functions f1, f2 ∈ F such that f1(x) = f2(x) + O(1), i.e., the difference |f1(x) − f2(x)| is bounded.

Let A be an oracle (a set of strings). By K^A(x) we denote the Kolmogorov complexity of a string x relativized to oracle A, i.e., the length of the shortest description of x if the decompressor is allowed to use A as an oracle. (See [2] or [7] for more details; we may use either plain complexity (usually denoted C or KS) or prefix complexity (usually denoted K or KP), though the game interpretation would be slightly different; see below.)

For a given A the function K^A is defined up to an O(1) additive term; therefore an O(1)-stable property α is well defined for K^A (it does not depend on the specific version of K^A). So α(K^A) becomes a property of the oracle A. It may be true for some oracles and false for others.

For example, if wn is the n-bit prefix of Chaitin’s random real Ω, the (O(1)-stable) property K^A(wn) ≥ 0.5n − O(1) is true for the trivial oracle A = ∅ and false for A = 0′ (the halting problem). The following result shows that for “usual” α the property α(K^A) is either true for all sufficiently large A or false for all sufficiently large A.

Theorem. Let α be an O(1)-stable Borel property. Then there exists an oracle A0 such that either α(K^A) is true for all A ≥T A0 or α(K^A) is false for all A ≥T A0.

Here ≥T stands for Turing reducibility. The statement is true for different versions of complexity (plain complexity, prefix complexity, decision complexity, a priori complexity, monotone complexity, etc.). We provide the proof for plain complexity C and then describe the changes needed for other versions.

∗ The paper is based on talks given by Andrej Muchnik (24.02.1958–18.03.2007) at the Kolmogorov seminar (Moscow University). The text was prepared by A. Shen and N. Vereshchagin, who are responsible for all errors and omissions.

Proof. Consider the following infinite game with full information. Two players, called (as usual) Alice and Bob, enumerate the graphs of two functions A and B respectively; the arguments and values of A and B are binary strings. The players move alternately; at each move a player may add finitely many pairs to the graph of her/his function (but cannot delete pairs that are already there, so the values of A and B that are already defined remain unchanged).

The winner is declared as follows. Let K_A and K_B be the complexity functions that correspond to decompressors A and B, i.e., K_A(x) = min{l(p) | A(p) = x}, where l(p) stands for the length of p; the function K_B is defined in a similar way. Let us agree that Alice wins if the function K(x) = min(K_A(x), K_B(x)) satisfies α, and Bob wins otherwise. (A technical correction: the functions K_A and K_B may have infinite values; we assume that α is somehow extended to such functions, e.g., is false for all functions with infinite values.)

Lemma. If Alice has a computable winning strategy in this game, then α(C) is true for the (plain) complexity function C; if Bob has a computable winning strategy, then α(C) is false.
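As a toy illustration (not from the paper), the complexity function induced by a finite fragment of a decompressor graph, and the combination min(K_A, K_B) that decides the winner, can be sketched as follows; the tiny decompressors are hypothetical positions in a play of the game:

```python
import math

def complexity(decompressor: dict, x: str) -> float:
    """K_A(x) = min{ l(p) : A(p) = x }, or +infinity if x has no description yet."""
    lengths = [len(p) for p, out in decompressor.items() if out == x]
    return min(lengths) if lengths else math.inf

def combined(A: dict, B: dict, x: str) -> float:
    """K(x) = min(K_A(x), K_B(x)) -- the function whose property alpha decides the winner."""
    return min(complexity(A, x), complexity(B, x))

# hypothetical tiny decompressor graphs built up during a play
A = {"0": "hello", "10": "world"}
B = {"": "world"}                  # B describes "world" by the empty program
print(combined(A, B, "hello"))     # 1
print(combined(A, B, "world"))     # 0
print(combined(A, B, "xyz"))       # inf
```

Adding pairs to a graph can only decrease these values, which matches the rule that already-defined values never change.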

The proof of the Lemma is straightforward. Assume that Alice has a computable winning strategy. Let her use this strategy against Bob enumerating the graph of an optimal decompressor B (so K_B(x) = C(x) for all x). Note that in fact Bob ignores the moves of Alice and enumerates the graph of B at his own pace. Since both players use computable strategies, the game is computable, so A is a computable decompressor. Therefore K_B ≤ K_A + O(1) due to the optimality of B, and min(K_A(x), K_B(x)) = K_B(x) + O(1) = C(x) + O(1). Since Alice wins and α is O(1)-stable, the function C has property α. The same argument (with the roles of Alice and Bob exchanged) applies if Bob has a computable winning strategy.

The statement and the proof of the lemma can be relativized: if Alice/Bob has a winning strategy that is A-computable for some oracle A, then α(C^A) is true/false.

Now recall Martin’s theorem on the determinacy of Borel games: the winning condition of the game described is a Borel set (since α has this property), so either Alice or Bob has a winning strategy in the game. So if the oracle A is powerful enough (is above the strategy in the hierarchy of Turing degrees), the property α(K^A) is true (if Alice has a winning A-computable strategy) or false (if Bob has a winning A-computable strategy). The theorem is proven.

## 2 Discussion

Let us make several remarks.

• First, note that not all theorems in algorithmic information theory are O(1)-stable. For example, most of the results about algorithmic properties of the complexity function are not stable. (The non-computability of the complexity function or its upper semicomputability are not stable, while the non-existence of a nontrivial computable lower bound is stable. Also, the Turing completeness of C is a non-stable assertion, though the stronger claim “any function that is O(1)-close to C can be used as an oracle to decide the halting problem” is stable.) The other assumption (being a Borel property) seems less restrictive: it is hard to imagine a theorem about Kolmogorov complexity where the property in question would not be Borel by construction.

• One may ask whether the statement of our theorem can be used as a practical tool for proving properties of Kolmogorov complexity. The answer is both yes and no.

Indeed, it is convenient to use some kind of game while proving results about Kolmogorov complexity, and usually the argument goes the same way: we let the winning strategy play against the “default” strategy of the opponent, and the fact that the winning strategy wins implies the statement in question. However, it is convenient to consider more special games. For example, proving the inequality C(x, y) ≥ C(x) + C(y|x) − O(log n) (for strings x and y of length at most n), we would consider a game where Alice wins if K_B(x, y) < k + l implies that either K_A(x) < k + O(log n) or K_A(y|x) < l + O(log n) for every n, k, l and all strings x, y of length at most n.

This example motivates the following version of the main theorem. Let α be a property of two functions in F, i.e., a subset of F × F. Assume that α is monotone in the following sense: if α(f, g) is true, f′(x) ≤ f(x) + O(1), and g′(x) ≥ g(x) − O(1), then α(f′, g′) is true, too. Consider the version of the game where Alice wins if α(K_A, K_B) is true. If Alice has a computable winning strategy, then α(C, C) is true; if Bob has a computable winning strategy, then α(C, C) is false. (The proof remains essentially the same.) We provide several examples where the game interpretation is used to prove statements about Kolmogorov complexity in Section 3; one more example can be found in [6].

• Going in the other direction, one would like to extend this result to arbitrary results of computability theory, not necessarily related to Kolmogorov complexity. Such an extension is given in [5].

• It is easy to modify the proof to cover other versions of Kolmogorov complexity. For example, for prefix complexity we may consider prefix-stable decompressors, where F(p) = x implies F(p′) = x for every p′ that has p as a prefix; similar modifications work for monotone and decision complexity. For a priori complexity the players specify lower approximations to a semimeasure.

• One may change the rules of the game and let Alice and Bob directly provide upper bounds K_A and K_B instead of enumerating the graphs of A and B. Initially K_A(x) = K_B(x) = +∞ for every x; at each step a player may decrease finitely many values of the corresponding function. The restriction (that goes back to Levin [1]) is that for every n there are at most 2^n strings x such that K_A(x) < n (and the same restriction for K_B). This approach works for prefix and decision complexities (but not for the monotone one).
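This restriction is easy to check mechanically. The sketch below (my own illustration; the dictionary encoding of the players' current bounds is an assumption, with absent strings treated as having infinite complexity) verifies that a position in this version of the game is legal:

```python
from collections import Counter

def is_legal(K: dict) -> bool:
    """Levin's restriction: for every n, at most 2**n strings x have K(x) < n.
    Strings absent from the dict are treated as having infinite complexity."""
    counts = Counter(K.values())
    below = 0                                  # number of strings with K(x) < n
    for n in range(max(K.values(), default=0) + 2):
        if below > 2 ** n:
            return False
        below += counts.get(n, 0)
    return True

print(is_legal({"0": 1, "1": 1, "00": 2}))   # True
print(is_legal({"a": 0, "b": 0, "c": 0}))    # False: three strings with K < 1
```

A legal move can only decrease values, so a player would re-run this check after each batch of decreases.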

## 3 Examples

### Conditional complexity and total programs

Let x and y be two strings. The conditional complexity C(x|y) of x when y is known can be defined as the length of the shortest program that transforms y into x (assuming the programming language is optimal). What if we require this program to be total (i.e., defined everywhere)?

It turns out that this requirement can change the situation drastically: there exist two strings x and y of length n such that C(x|y) = O(log n) but any total program that transforms y into x has complexity (and length) at least n − O(log n). (Note that a total program that maps everything to x has complexity at most n + O(1).) To prove this statement, we use the following game for a fixed n.

We enumerate the graph of some function f: B^n → B^n, where B^n stands for the set of n-bit strings (at each move we add some pairs to this graph). The opponent enumerates a list of at most 2^n − 1 total functions g1, g2, ... (at each move the opponent may add some functions to this list). We win the game if there exist strings x, y ∈ B^n such that f(y) = x but gi(y) ≠ x for all i.

Why we can win this game: first we choose some x and y and declare that f(y) = x. After every move of the opponent we choose some fresh y′ where f is still undefined and declare f(y′) = x′, where x′ is different from the currently known values g1(y′), g2(y′), .... The number of the opponent’s moves is less than 2^n, therefore an unused y′ always exists (we use only one point per move of the opponent), and a value x′ different from all gi(y′) exists as well.
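The strategy above can be simulated for small n. The sketch below (my own encoding: total functions are dictionaries on B^n, revealed one per move) lets our strategy play against 2^n − 1 opponent functions and checks that some pair (y, f(y)) escapes all of them:

```python
import itertools
import random

def play(n, opponent_functions):
    """Simulate the game on B^n: we build a partial f while the opponent
    reveals at most 2**n - 1 total functions, one per move."""
    strings = ["".join(bits) for bits in itertools.product("01", repeat=n)]
    f = {strings[0]: strings[0]}                       # our first move: any pair
    revealed = []
    for g in opponent_functions:
        revealed.append(g)
        y = next(s for s in strings if s not in f)     # fresh argument for f
        used = {h[y] for h in revealed}                # values we must avoid
        x = next(s for s in strings if s not in used)  # exists: |revealed| < 2**n
        f[y] = x
    # we win if some pair (y, f(y)) disagrees with every revealed function at y
    return any(all(g[y] != x for g in revealed) for y, x in f.items())

random.seed(0)
n = 3
strings = ["".join(b) for b in itertools.product("01", repeat=n)]
opponents = [{s: random.choice(strings) for s in strings} for _ in range(2**n - 1)]
print(play(n, opponents))  # True: the strategy wins whatever the opponent reveals
```

The counting from the text is visible in the two `next(...)` calls: fresh arguments and unused values never run out because the opponent makes fewer than 2^n moves.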

Why the statement is true: let us use our strategy against the following opponent strategy: enumerate all total functions B^n → B^n that have complexity less than n. (Each function is considered here as the list of its values.) This strategy is computable given n, and therefore the game is computable. Therefore, for the winning pair (x, y) we have C(x|y) = O(log n), since n is enough to describe the whole process and therefore to compute the function f. On the other hand, any total function that maps y to x has complexity at least n − O(log n); otherwise the list of its values would appear in the enumeration.

So if we denote by CT(x|y) the length of the shortest program of a total function that maps y to x, we get a (non-computable) upper bound for C(x|y) that sometimes differs significantly from it: it is possible that CT(x|y) is about n while C(x|y) is O(log n) (for strings x and y of length n).

### Extracting randomness requires Ω(log n) additional bits

Let us consider a question that can be regarded as a Kolmogorov-complexity version of randomness extraction (though the similarity is superficial). Assume that a string x is “weakly random” in the following sense: its complexity is high (at least n) but still can be smaller than its length, which is polynomial in n. We want to “extract” randomness out of x, i.e., to get a string y that is random (= incompressible: its length is close to its complexity) using few additional bits, i.e., C(y|x) should be small. When is it possible?

The natural approach is to take a shortest program for x as y. Then y is indeed incompressible (C(y) = l(y) + O(1); here l(y) stands for the length of y). And the complexity of y when x is known is O(log n): knowing x and the length of a shortest program for x, we can find such a program. Taking the first n bits of this shortest program, we get a string of length n, complexity n + O(log n), and conditional complexity O(log n) relative to x.

What if we put the stronger requirement that C(y|x) be O(1) or o(log n)? It turns out that “randomness extraction” in this stronger sense is not always possible: there exists a string x of length n^2 that has complexity at least n such that every string y of length n with conditional complexity C(y|x) less than 0.5 log n has unconditional complexity O(log n) (i.e., is highly compressible). (The same is true for all strings y of length less than n, so we cannot extract even n/2 “good random bits” using o(log n) advice bits.)

To prove this statement, consider the following game. There are two sets L = B^n (the “left part”) and R = B^(n^2) (the “right part”). The opponent at each move may choose some elements l ∈ L and r ∈ R and add an edge between them (declaring l to be a neighbor of r). The restriction is that every element of R should have at most d = √n neighbors. We may mark some elements of L (at most O(√n) elements) as “simple”. We win if there are at least 2^n elements of R all of whose neighbors are marked.

Why the statement is true if we can win the game (using a computable strategy): let the opponent declare y ∈ L to be a neighbor of x ∈ R if C(y|x) < 0.5 log n. Then every x ∈ R has at most 2^(0.5 log n) = √n = d neighbors. The process is computable, so the game can be effectively simulated. Therefore, all y marked as “simple” indeed have complexity O(log n), since each of them can be described by n and its ordinal number in the enumeration of simple elements (the latter requires about 0.5 log n bits).

Among the 2^n elements of R that have the winning property (all their neighbors are marked) there is one that has complexity at least n, and this is exactly what we claimed.

How to win the game: we do nothing while there are at least 2^n elements of R that have no neighbors (if this lasts forever, the required property holds). After 2^(n^2) − 2^n elements of R get neighbors in L, we mark the neighbor that is used most often: it is a neighbor of at least (2^(n^2) − 2^n)/2^n ≥ 2^(n^2−n) − 1 ≥ 2^(n^2−2n) elements of R, and we restrict our attention to these “selected” elements, ignoring all other elements of R. Then we do nothing while more than 2^n of the selected elements have no second neighbor. After that we mark the most used second neighbor and get at least (2^(n^2−2n) − 2^n)/2^n ≥ 2^(n^2−4n) elements that have two marked neighbors. Continuing in this way, we either wait indefinitely at some step (and then at least 2^n elements have only marked neighbors) or finally get 2^(n^2−2dn) ≥ 2^n elements that have d marked neighbors and therefore cannot have unmarked ones (recall that d = √n, so n^2 − 2dn = n^2 − 2n√n ≥ n for sufficiently large n).
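The counting behind this strategy can be checked on a toy offline variant (my own simplification, not the paper's online protocol): each element of R has at most d neighbors in L, and repeatedly marking the most popular unmarked neighbor of the surviving elements keeps at least a 1/|L| fraction of them alive per round, so after d rounds at least |R|/|L|^d elements have all neighbors marked:

```python
from collections import Counter
import random

def mark_neighbours(L, neigh, d):
    """neigh maps each element of R to its set of at most d neighbours in L.
    After d marking rounds, every returned winner has all neighbours marked,
    and at least len(neigh) / len(L)**d winners remain."""
    survivors = list(neigh)
    marked = set()
    for _ in range(d):
        # popularity of each still-unmarked neighbour among the survivors
        pool = Counter(v for r in survivors for v in neigh[r] - marked)
        if not pool:
            break                      # every survivor is fully marked already
        best = pool.most_common(1)[0][0]
        marked.add(best)
        survivors = [r for r in survivors if best in neigh[r] or neigh[r] <= marked]
    return marked, [r for r in survivors if neigh[r] <= marked]

random.seed(1)
L, d, size_R = list(range(6)), 2, 200
neigh = {r: set(random.sample(L, d)) for r in range(size_R)}
marked, winners = mark_neighbours(L, neigh, d)
print(len(marked) <= d and len(winners) >= size_R / len(L) ** d)  # True
```

Each round marks one element and shrinks the survivor set by at most a factor of |L|, which mirrors the divisions by 2^n in the argument above.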